VERTEBRATE HOMOLOGUES OF UNC-53
PROTEIN OF C. ELEGANS

The present invention relates to vertebrate
homologues of UNC-53 protein of C. elegans and cDNA
sequences coding for said homologues or functional
equivalents thereof. The invention also relates to
processes for identifying compounds which control cell
behaviour, to the compounds identified and to
pharmaceutical compositions containing them, in
addition to processes and assays for identifying
disease states in which said gene or protein is
dysfunctional.

The control of cell motility, cell shape and
directionality of cell outgrowth of axons or other
cell outgrowths is an essential feature in the
morphogenesis and function of both unicellular and
multicellular organisms. The control of these
processes is disturbed in a variety of disease states
in which, for example, the receptor tyrosine kinase
(RTK) signal transduction pathways, or the like, or
their downstream intra-cellular pathways (which are
shared with other extra-cellular receptors, including
cell adhesion molecules like N-CAMs and integrins) are
overstimulated.

Some cell surface proteins and extra-cellular
molecules controlling the directionality and potential
of cell migration have been identified, although the
processes involved are not generally understood.

It is generally considered that a long-range
migration of a cell process (also known as a growth
cone extension) is a stepwise event, whereby prior to
and after each extension there is the formation of a
structure at the leading edge of the cell which senses
signals in the environment instructing the cell to
either stabilise a cell process extending in a
preferred direction, or to cause a lamellipodium to
extend a process in a given direction. Localised
stabilisation of the actin cytoskeleton and
association with plus end regions of microtubules is a
general cell biological process underlying the choice
of directional extension. Microtubule binding
directing these processes has not previously been
identified. The present inventors have surprisingly
found that UNC-53 protein of C. elegans and vertebrate
homologues thereof are involved in binding of
microtubules and particularly of plus end regions of
microtubules.

A gene from the free-living nematode
Caenorhabditis elegans designated "unc-53" has been
previously identified and cloned (Abstract,
International C. elegans Meeting, June 1-5 1991,
Madison, Wisconsin, 58, Bogaert and Goh). The present
inventors previously identified UNC-53 protein as a
signal transducer or signal integrator controlling the
directionality of cell migration and/or cell shape in
C. elegans (WO 96/38555). Increased UNC-53 protein
activity was found to be proportional to cell process
extensions in the correct direction of cell migration.
The unc-53 gene was found to encode a signal
transduction molecule that transduces a signal from an
RTK, for example via the adaptor protein
SEM-5/GRB-2, to the machinery controlling directional
growth cone extension or stabilisation, in a highly
dosage-dependent fashion.

Genetic and experimental analysis of C. elegans
UNC-53 mutants showed that mutations in the unc-53
gene do not affect the general ability of cells to
migrate but rather affect the ability of cells to
migrate under specific antero-posterior cues.
Reduction of UNC-53 activity leads to loss of
direction and reduction of growth cone extension as
indicated by the directionality of random extension
cycles observed in excretory canal growth cones in
UNC-53 mutants.

The function of UNC-53 is highly sensitive to its
dosage or activity. Reduction of function leads to
proportional reduction of migration to the specific
signal while increased expression, using transgenic
expression of UNC-53 in muscle cells, leads to
increased directional migration. The data lead to the
conclusion that UNC-53 functions as an integrator of a
directional signal in the organism whereby reception
of signals leads to growth cone extension in the
correct direction.

Certain alleles of UNC-53 enhance the sex
myoblast migration defect of SEM-5 C. elegans mutants
in a receptor tyrosine kinase signal transduction
pathway (Stern et al. 1993, Mol. Biol. Cell, 4,
1175-1188). While the genetics suggests that UNC-53 and
SEM-5 cooperate to regulate sex myoblast migration,
genetic experiments do not permit a conclusion that
this is the result of a direct molecular interaction.
The inventors previously identified a potential
SEM-5/GRB-2 binding site and showed in two types of
biochemical experiments that UNC-53 physically
interacts with SEM-5. The present inventors conclude
that UNC-53 encodes a signal transduction molecule
that transduces extracellular signals for directional
migration via the adapter protein SEM-5/GRB-2 to the
machinery controlling directional growth cone
extension or stabilization.

Several lines of evidence indicate that UNC-53
might act as an adapter linking extracellular signals
to the actin cytoskeleton. Firstly, UNC-53 has shown
homology to cortical actin binding proteins and
is capable of binding F-actin in vitro. In
addition, expression of UNC-53 in mammalian cells leads
to changes in the F-actin cytoskeleton. Very low
levels of UNC-53 expression increase the number of
filopodia and actin microspikes protruding from the
cell surface. Cells expressing UNC-53 also exhibit
increased neurite extension and increased cell
motility. UNC-53 thus also acts as an activator of
migration.

Considering all available data, the following
possible mechanisms of action of UNC-53 can be
formulated.

The choice and activation of directional growth
cone extension can be accounted for by local
activation of UNC-53 via a SEM-5/GRB-2 complex to a
receptor (e.g. receptor tyrosine kinase signal) which
reads a localized or directional signal. Changes in
growth cone steering are preceded by the formation of
a localized actin patch in the area of the growth cone
receiving the highest signal (Bentley and O'Connor et
al. Curr. Op. Neurobiol. 1994, vol 4, 43-48).
UNC-53 might be directly involved in forming these
actin patches through its own actin binding or
cross-linking properties. Alternatively, activated UNC-53
may (e.g. via its nucleotide binding domain) transduce a
signal to as yet unidentified effectors. For example,
activation of the small GTP-binding protein cdc42 or a
related protein leads to formation of small actin
patches as well as the formation of small filopodia.
The unc-53 pathway may be upstream of cdc42 or both
signal transducers might share downstream pathways.

The present inventors thus decided to investigate 
if a similar protein was present in higher organisms 
such as vertebrates. 

The present inventors describe the identification
of a family of genes in vertebrates, and particularly
in man and mouse, with extensive structural homology to
UNC-53. The present inventors have surprisingly found
that the nucleotide domains of UNC-53 from C. elegans
and UNC-53 from vertebrates similarly activate
motility, establishing functional equivalence.
Furthermore these domains are shown to be capable of
transforming NIH3T3 cells in vitro. The inventors
also found changes in RNA transcripts in transformed
cell lines compared to normal human tissues, suggesting
a role for UNC-53 in cell differentiation,
morphogenesis and disease. Furthermore, in vitro
assays and transgenic models are also described that
identify pharmacological modulators of UNC-53 activity
and assays to identify proteins interacting with
UNC-53.

According to a first aspect of the present
invention, there is provided a vertebrate protein
homologue of UNC-53 protein of C. elegans or a
functional equivalent, derivative or bioprecursor
thereof, which protein homologue comprises an amino
acid sequence having a statistically significant
homology to the UNC-53 protein of C. elegans as
illustrated in Figure 2. According to the present
invention a derivative should be taken to mean
mutational derivatives, fusions, internal deletions,
splice variants and muteins.

There is also provided according to a second
aspect of the present invention a vertebrate protein
homologue of UNC-53 protein of C. elegans, which
protein comprises an amino acid sequence having one or
more of sequence homology blocks A, B, C, D or E as
illustrated in Figure 9a, or block F in Figure 12a or
a sequence having a statistically significant homology
therewith.
Preferably, said vertebrate homologue is a human
protein or a mouse protein.

According to a further aspect of the invention
there is provided a vertebrate protein homologue of an
UNC-53 protein of C. elegans, which protein comprises
an amino acid sequence having one or more of sequence
blocks A, B, C, D, E or F which differ from those
blocks of Figure 9a and Figure 12a to a significant
extent only in conservative amino acid changes. In an
even further aspect of the invention there is provided
a vertebrate protein having an amino acid sequence
encoded by the nucleotide sequence from position 1 to
position 6013 as illustrated in Figure 9b. There is
also provided a vertebrate protein having an amino
acid sequence encoded by the nucleotide sequence
illustrated in Figure 11d, or a functional equivalent,
derivative, or bioprecursor of said homologue.

According to a further aspect of the present
invention there is provided a vertebrate protein
having an amino acid sequence corresponding to the
PROSITE signatures as illustrated in Figure 28 for
each of said homology blocks as defined above.
Advantageously the PROSITE signatures can be used to
identify a protein having a statistically significant
homology to the UNC-53 protein of C. elegans (Luethy
et al. 1994, Protein Science, 3, 139-146).
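
By way of illustration only, scanning a candidate sequence for a PROSITE-style signature can be sketched as follows. This is a minimal sketch and not the inventors' procedure: the function names, the example pattern and the example sequence are hypothetical and are not taken from Figure 28, and only the basic pattern elements (residues, x, [..], {..}, repeat counts and the < and > anchors) are handled.

```python
import re

# One pattern element: x, a residue, [allowed] or {excluded},
# optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if lo is not None:
            repeat = "{" + lo + ("," + hi if hi is not None else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_signature(pattern, sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Hypothetical pattern and sequence, for demonstration only.
EXAMPLE_PATTERN = "C-x(2,4)-C-x(3)-[LIVM]-{P}-H."
EXAMPLE_SEQUENCE = "AACGGCAAALAHTT"
hits = find_signature(EXAMPLE_PATTERN, EXAMPLE_SEQUENCE)
```

A sequence giving one or more hits for the signature of a homology block would then be a candidate for closer comparison against that block.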

A further aspect of the invention comprises a
vertebrate homologue according to the invention
comprising an amino acid sequence as shown in Figure
9b or 11d or an amino acid sequence which differs from
the amino acid sequences shown in these figures to a
significant extent only in one or more conservative
amino acid changes.
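
The notion of two sequences differing only in conservative amino acid changes can be illustrated with a short sketch. It assumes the two sequences are already aligned to equal length, and the grouping of residues by side-chain chemistry is an assumption made for the example; the text does not define which substitutions count as conservative, and all names below are hypothetical.

```python
# Illustrative residue classes (an assumption, not defined in the text).
CONSERVATIVE_GROUPS = [
    set("AVLIM"),   # aliphatic / hydrophobic
    set("FWY"),     # aromatic
    set("ST"),      # hydroxyl-containing
    set("NQ"),      # amide
    set("DE"),      # acidic
    set("KRH"),     # basic
    set("G"), set("P"), set("C"),  # treated as classes of their own
]


def is_conservative(a, b):
    """True if residues a and b are identical or fall in the same class."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_differences(reference, query):
    """Compare two pre-aligned sequences position by position.

    Returns two lists of (1-based position, reference residue, query residue):
    the conservative changes and the non-conservative changes.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    conservative, non_conservative = [], []
    for pos, (a, b) in enumerate(zip(reference, query), start=1):
        if a == b:
            continue
        target = conservative if is_conservative(a, b) else non_conservative
        target.append((pos, a, b))
    return conservative, non_conservative


def differs_only_conservatively(reference, query):
    """True if every difference between the sequences is conservative."""
    _, non_conservative = classify_differences(reference, query)
    return not non_conservative
```

For example, with the hypothetical peptides `MKVLDE` and `MRVIDE`, the K to R and L to I changes both fall within a class, so the pair differs only conservatively under this grouping.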

In a further aspect of the present invention
there is also provided a nucleic acid molecule, which
is preferably DNA, and which encodes a vertebrate
homologue of UNC-53 protein of C. elegans, or a
functional equivalent, derivative, fragment or
bioprecursor of said homologue according to the
invention. Preferably, the cDNA comprises a sequence
of nucleotides encoding an amino acid sequence as
illustrated in Figures 9b or 11d or an amino acid
sequence which differs from the sequences shown in
these figures to a significant extent only in one or
more conservative amino acid changes. Preferably the DNA
is cDNA, which cDNA comprises at least from position 1
to 6013 of the sequence shown in Figure 9b.
Alternatively the cDNA may comprise the sequence
illustrated in Figure 11d. Also provided by the
present invention is a nucleic acid sequence capable
of hybridising to the nucleic acid or DNA sequences
according to the invention under high stringency
conditions, which conditions are well known to those
skilled in the art.

The cDNA according to the invention may be
included in an expression vector which may itself be
used to transform or transfect a host cell, which cell
may be bacterial or eukaryotic in origin including,
for example, an animal or plant cell, a fungal
cell or an insect cell. Thus, advantageously, once
the cDNA corresponding to the genome of the vertebrate
homologue of UNC-53 of C. elegans is synthesised,
using for example, reverse transcriptase or the like,
a range of cells, tissues or organisms may be
transfected following incorporation of the selected
cDNA clone into an appropriate expression vector. The
expression vector according to the invention may
comprise a promoter of C. elegans or one of human,
mouse or viral origin and optionally a sequence
encoding a reporter molecule, such as, for example,
green fluorescent protein.

The present invention, therefore, also further
comprises a transgenic cell, tissue or organism
comprising a transgene capable of expressing a
vertebrate homologue of UNC-53 protein of C. elegans
or a functional equivalent, fragment, derivative or
bioprecursor of said homologue. The term "transgene
capable of expressing a vertebrate homologue of UNC-53
protein of C. elegans" as used herein means a suitable
nucleic acid sequence which leads to the expression of
a vertebrate homologue of UNC-53 protein of C. elegans
having the same function and/or activity. The
transgene may include, for example, genomic nucleic
acid isolated from the appropriate vertebrate or
synthetic nucleic acid including cDNA. The term
"transgenic organism, tissue or cell", as used herein,
means any suitable organism and/or part of an
organism, tissue or cell, that contains exogenous
nucleic acid either stably integrated in the genome or
in an extrachromosomal state.

Preferably the transgenic cell comprises any of
a COS cell, a HepG2 cell, an MCF-7 or N4 neuroblastoma
cell, an NIH3T3 cell, a colorectal or carcinoma cell or
a human derived cell such as a fibroblast or the like.
The transgenic organism may be an insect, a non-human
animal or a plant and preferably C. elegans or a
related nematode. Preferably, the transgene comprises
the nucleic acid sequence encoding the vertebrate
homologue or a functional fragment of said gene
according to the invention as described above. The
transgene preferably comprises an expression vector
according to the invention.

The term "functional fragment" as used herein
should be taken to mean a fragment of the gene coding
for the vertebrate homologue of the UNC-53 protein of
C. elegans or a functional equivalent or derivative or
bioprecursor of said protein. For example, the gene
may comprise deletions or mutations but may still
encode a functional vertebrate homologue of UNC-53
protein.

Further provided by the present invention is a
method of producing a mutant vertebrate non-human
organism or cell having a mutation in the wild-type
gene coding for the vertebrate homologue of UNC-53
protein, which mutation affects cell behaviour or the
regulation of cell motility or the shape or the
direction of cell migration or microtubule plus end
stability or function and localisation of protein
complexes located thereon, which method comprises
inducing a mutation in the vertebrate homologue of
UNC-53 protein in said organism or cell. These mutant
organisms or cells may be used in a screen to identify
the effects of compounds on these cell functions.

The vertebrate homologue of UNC-53 protein of
C. elegans or the cDNA or genomic DNA encoding it or a
functional equivalent, derivative, fragment or
bioprecursor of said homologue, may advantageously be
used as a medicament, or in the preparation of a
medicament to promote neuronal regeneration,
revascularisation or wound healing or the treatment of
chronic neurodegenerative disorders or acute traumatic
injuries or fibrotic disease or physiological events
requiring the polarity of cells or epithelia. The
present inventors have also found that the vertebrate
homologue of UNC-53 protein plays a role in a
transformed state of cells. Accordingly, the
vertebrate homologue, dominant positive or negative
mutants thereof, or inhibitors thereof may
advantageously be used to induce or alleviate contact
inhibition in a cell or in preventing cancer
development. Typically, the above medical conditions
may be treated in mammals and more preferably humans
by either a homologue of UNC-53 protein or
alternatively by a nucleic acid coding for such a
protein. Alternatively an antisense oligonucleotide
to said UNC-53 homologue may be used to prevent its
expression. Examples of other nucleic acid sequences
which may be used include 3' untranslated regions of
mRNA which could be used to prevent transcription of
the genomic sequence encoding the vertebrate
homologue of UNC-53 protein.

The vertebrate homologue of UNC-53 protein or a
functional equivalent, fragment or bioprecursor of
said protein may be incorporated into a
pharmaceutically acceptable composition together with
a suitable carrier, diluent or excipient therefor.
The pharmaceutical composition may advantageously
comprise, additionally or alternatively, the nucleic
acid sequence according to the invention as defined
above.

The present invention also provides for a method
of determining whether a compound is an inhibitor or
enhancer of the regulation of cell behaviour, growth,
transformation, cell shape or motility or the
direction of cell migration or microtubule plus end
stability or function and localisation of protein
complexes thereon, which method comprises contacting
said compound with a transgenic cell according to the
invention and screening for a phenotypic change in
said cell. Preferably the method can determine
whether the compound comprises an inhibitor or an
enhancer of the signal transduction pathway of said
transgenic cell of which pathway said vertebrate
homologue of UNC-53 protein, or a functional
equivalent, derivative, fragment or bioprecursor of
said vertebrate homologue is a component or whether
said compound is an inhibitor or an enhancer of a
parallel or redundant signal transduction pathway in
said cell. The present invention also provides a
method to determine that the protein in said signal
transduction pathway is a vertebrate homologue of
UNC-53 protein of C. elegans or a functional equivalent,
fragment, derivative or bioprecursor of said
vertebrate homologue.

Preferably, the phenotypic change to be screened
comprises a change in cell shape or a change in cell
motility. Where a transgenic cell is used in
accordance with one embodiment of the method of the
invention, an N4 neuroblastoma cell may be used and in
such an embodiment the phenotypic change to be
screened may be the length of neurite growth or
changes in filopodia outgrowth or alternatively
changes in ruffling behaviour or cell adhesion or any
change in microtubule cytoskeleton or any change in
localisation of proteins on plus end regions of
microtubules or any change in cell death such as in
apoptosis. In an alternative embodiment of the method
of the invention, the transgenic cell may comprise an
MCF-7 breast cancer cell. Typically in such an
embodiment the phenotypic change to be screened
comprises the extent of phagokinesis or filopodia
formation. In an alternative embodiment of this
aspect of the invention, the transgenic cell may
comprise an NIH3T3 cell. Typically in such an
embodiment the phenotypic change to be screened
comprises loss of contact inhibition of foci
formation. The method according to the invention may
also utilise a mutant cell or mutant organism
according to the invention as described above, where
the mutant cell is capable of growing in tissue
culture or in vivo and either of which cell or
organism has a mutation in the wild-type unc-53 gene.

In accordance with the present invention, a
"phenotypic change" may be any phenotype resulting
from changes at any suitable point in the life cycle
of the cell, tissue or organism defined above, which
change can be attributed to the expression of the
transgene such as for example, growth, viability,
morphology, behaviour, movement, cell migration or
cell process or growth cone extension of cells and
includes changes in body shape, locomotion,
chemotaxis, contact inhibition, mating behaviour or
the like. The phenotypic change may preferably be
monitored directly by visual inspection of the cell as
a whole or particularly by monitoring the F-actin
cytoskeleton, microtubule network and plus end
stability of microtubules or proteins thereon or
alternatively by for example measuring indicators of
viability including endogenous or transgenically
introduced histochemical markers or other reporter
genes, such as for example β-galactosidase or green
fluorescent protein.

A compound which is identifiable by the method
according to the invention as described above, as an
enhancer of the processes identified above such as the
regulation of cell shape or motility or the direction
of cell migration may be used as a medicament, or
alternatively in the preparation of a medicament, for
promoting neuronal regeneration, revascularisation or
wound healing, or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries or
fibrotic disease. Examples of promoting neuronal
regeneration include, for example, peripheral nerve
regeneration after trauma and spinal cord trauma.

Where a compound is identified in accordance with
the method described above as being an inhibitor of
the regulation of cell shape etc., the compound may be
used as a medicament, or in the preparation of a
medicament, for substantially alleviating spread of
disease inducing cells, such as in spread of cancer,
or the like in metastasis or in alleviating loss of
contact inhibition. Advantageously, any of the
compounds which may have been identified as an
inhibitor or an enhancer in accordance with the method
as described above, may also be included in a
pharmaceutical composition comprising the respective
compound and a pharmaceutically acceptable carrier,
diluent or excipient therefor.

The particular mechanism of action of a compound
identified as either an inhibitor or an enhancer of
the cell motility, shape, growth or direction of cell
migration or microtubule association or to the plus
end region thereof is not limiting. Preferably the
compound acts as an inhibitor or enhancer of a signal
transduction pathway. The compound may also act on a
parallel pathway or directly on the vertebrate
homologue of UNC-53 protein of C. elegans. For
example, the method of action of the compound may
include direct interaction with the vertebrate
homologue of UNC-53 protein, interaction with
processes for regulating phosphorylation or
dephosphorylation of the vertebrate homologue of
UNC-53 or with processes regulating activity of an unc-53
gene or with processes for post-transcriptional or
post-translational modification or the like.

Preferably the compound is identified by the
method according to the invention as an inhibitor or
an enhancer, by utilising differences of phenotype of
the cell, tissue or organism, which are visible to the
eye. Alternatively indicators of viability including
endogenous or transgenically introduced histochemical
markers or a reporter gene may be used.

According to a further aspect of the invention
there is also provided a transgenic cell or tissue
culture which has been constructed to comprise a
promoter sequence of a gene coding for a vertebrate
homologue of UNC-53 of C. elegans or a functional
equivalent, derivative, fragment, or bioprecursor of
said homologue operably linked to a nucleic acid
sequence encoding a reporter molecule. Preferably,
the reporter molecule encoded by said sequence
comprises a detectable protein, for example one
which may be monitored by eye inspection such as
antibiotic resistance, β-galactosidase or a molecule
detectable by spectrophotometric, spectrofluorometric,
luminescent or radioactive assays.

The present invention also provides a method of
determining whether a compound is an inhibitor or an
enhancer of transcription of a gene coding for a
vertebrate homologue of UNC-53 protein of C. elegans,
or a functional equivalent, derivative, fragment or
bioprecursor of said homologue, which method comprises
the steps of:

(a) contacting said compound with a transgenic
cell according to the invention as described
above,

(b) monitoring the level of said reporter
molecule and comparing results obtained from this
monitoring step with a control comprising a
transgenic cell having the promoter sequence of a
gene coding for a vertebrate homologue of UNC-53
protein, or a functional fragment of said
homologue and the reporter molecule, in the
absence of the compound.

In one embodiment of the method according to this
aspect of the invention the reporter molecule may
comprise messenger RNA.
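
The comparison of treated cells with the untreated control in the reporter method can be sketched as a simple fold-change calculation. This is a minimal sketch under stated assumptions: the function name, the replicate readings and the 1.5-fold cut-off are hypothetical and are not specified in the text, and a real assay would also need an appropriate statistical test.

```python
from statistics import mean


def classify_compound(treated, control, threshold=1.5):
    """Classify a compound from replicate reporter readings.

    treated   : reporter levels measured in the presence of the compound
    control   : reporter levels measured in the absence of the compound
    threshold : hypothetical fold-change cut-off (an assumption)

    Returns (label, fold_change), where label is "enhancer",
    "inhibitor" or "no effect".
    """
    if not treated or not control:
        raise ValueError("both groups need at least one reading")
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control reporter level must be positive")
    fold_change = mean(treated) / control_mean
    if fold_change >= threshold:
        label = "enhancer"
    elif fold_change <= 1 / threshold:
        label = "inhibitor"
    else:
        label = "no effect"
    return label, fold_change
```

With hypothetical readings of 200, 220 and 180 against a control of 100, 100 and 100, the fold change is 2.0 and the compound would be scored as an enhancer of transcription under this cut-off.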

A compound identified as an enhancer of
transcription of the gene coding for the vertebrate
homologue of UNC-53 protein of C. elegans or a
functional equivalent, derivative or bioprecursor of
said homologue may also be used as a medicament, or in
the preparation of a medicament, for promoting
neuronal regeneration, revascularisation or wound
healing, or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries or
fibrotic disease. Furthermore, such compounds may be
included in a pharmaceutical composition including a
pharmaceutically acceptable carrier, diluent or
excipient therefor. Any compounds identified as
inhibitors of transcription may, advantageously, be
used in alleviating the spread of disease inducing
cells such as cancers or metastasis or loss of contact
inhibition.

The present invention also provides a kit for
determining whether a compound is an enhancer or an
inhibitor of the regulation of cell growth,
transformation, cell motility or shape or the
direction of cell migration, which kit comprises at
least one transgenic or mutant cell or transgenic or
mutant non-human organism according to the invention
as described above and a plurality of wild-type cells
or one organism of the same type, or a cell line or
tissue culture and means for contacting said compound
with said cell or organism.

Also provided by the present invention is a kit
for determining whether a compound is an inhibitor or
an enhancer of transcription of a gene coding for a
vertebrate homologue of UNC-53 protein of C. elegans
or a functional equivalent, derivative or fragment
thereof, which kit comprises at least one transgenic
cell or cells according to the invention and means for
contacting said compounds with said cells.

For the purposes of the present invention, the
term "gene coding for a vertebrate homologue of UNC-53
or a functional fragment of said homologue" includes
the nucleic acid sequence shown in Figures 9b or 11d
or a fragment thereof, including the differentially
spliced isoforms and transcriptional starts of the
nucleic acid sequence and which sequence encodes a
vertebrate homologue of UNC-53 protein or a functional
equivalent, derivative, fragment or bioprecursor of
the protein.

The present invention also provides methods of
identifying genes of vertebrates or fragments of said
genes, which encode proteins which are active in the
signal transduction pathway of which the vertebrate
homologue of UNC-53 is a component. A preferred
method comprises hybridizing to an appropriate cDNA
library a nucleotide sequence, as defined herein, or a
fragment thereof under appropriate conditions of
stringency in order to identify genes having
statistically significant homology with the cDNA
clones of any one of the cDNA sequences according to
the invention described above.

Furthermore, there is also provided by the
present invention a method of identifying a protein
which is active in the signal transduction pathway of
a cell of which a vertebrate homologue of UNC-53
protein of C. elegans or a functional equivalent,
fragment or bioprecursor of said vertebrate homologue
is a component. According to this aspect of the
invention, the method comprises:

(a) contacting an extract of said cell with an
antibody to the vertebrate homologue of UNC-53
protein or a functional equivalent, fragment or
bioprecursor of said protein,

(b) identifying the antibody/vertebrate
homologue of UNC-53 complex, and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein other than the antibody.

WO 98/24810    PCT/EP97/06956

The vertebrate homologue of UNC-53 protein
may therefore bind regions of other proteins involved
in the signal transduction pathway. It is also
possible to sequentially identify a whole range of
proteins involved in the signal transduction pathway.

Antibodies to the vertebrate homologue of UNC-53
protein may be produced according to techniques
known to those skilled in the art. For
example, polyclonal antibodies may be prepared by
inoculating a host animal, such as a mouse, with a
protein or epitope of a protein according to the
invention and recovering immune serum.

This aspect of the invention further comprises a
method of identifying a further protein or proteins
which are active in the signal transduction pathway of
a cell of which UNC-53 is a component, which method
comprises:

(a) forming an antibody to the first identified 
protein bound to the vertebrate homologue of 
UNC-53 protein in the method as described above, 

(b) contacting a cell extract with the antibody, 

(c) identifying the antibody/protein complex, 

(d) analysing the complex to identify any 
further protein bound to the first protein other 
than the antibody, and 

(e) optionally repeating steps (a) to (d) to 
identify further proteins in the pathway. 

According to this aspect of the present
invention, the antibody starts the process by binding
to the vertebrate homologue of UNC-53 protein or a
functional equivalent thereof in the signal
transduction pathway. Any other proteins found
complexed to the bound antibody or
UNC-53 protein can then be used to identify further
interacting proteins involved in the pathway.
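
The repeated use of each newly identified protein as the starting point for the next round amounts to a breadth-first traversal of an interaction network. The sketch below illustrates only that bookkeeping: the `pull_down` callable is a hypothetical stand-in for the laboratory step (antibody binding, complex identification and analysis), and the protein names in the example table are placeholders, not results.

```python
from collections import deque


def map_pathway(bait, pull_down, max_rounds=None):
    """Iteratively identify proteins, starting from a bait protein.

    bait       : name of the starting protein
    pull_down  : callable returning the proteins found bound to a given
                 protein (hypothetical stand-in for the wet-lab step)
    max_rounds : optional limit on the number of rounds

    Returns a dict mapping each protein to the round in which it was
    first identified (the bait itself is round 0).
    """
    found = {bait: 0}
    queue = deque([bait])
    while queue:
        current = queue.popleft()
        round_no = found[current] + 1
        if max_rounds is not None and round_no > max_rounds:
            continue
        for partner in pull_down(current):
            if partner not in found:      # skip proteins already identified
                found[partner] = round_no
                queue.append(partner)
    return found


# Placeholder interaction table, for demonstration only.
EXAMPLE_TABLE = {
    "bait": ["P1", "P2"],
    "P1": ["P3"],
    "P2": ["P1"],
    "P3": [],
}
pathway = map_pathway("bait", lambda protein: EXAMPLE_TABLE.get(protein, []))
```

Recording the round in which each protein first appears gives a rough measure of how many binding steps separate it from the starting protein.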

It may also be possible to identify proteins
involved in the signal transduction pathway of a cell
of which the vertebrate homologue of UNC-53 or a
functional equivalent, derivative or bioprecursor
thereof is a component by using a vertebrate homologue
of UNC-53 protein of C. elegans. According to this
aspect of the invention the method comprises:

(a) contacting an extract of the cell with the
vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent,
fragment or bioprecursor of said homologue,

(b) identifying the vertebrate homologue of
UNC-53 protein/protein complex formed, and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein other than the same
vertebrate homologue of UNC-53 protein.

This method can also advantageously be used to 
identify further proteins in a signal transduction 
pathway of a cell by contacting an extract of the cell 
used as described above, with any protein identified 
from step (c) above not being a vertebrate homologue 
of UNC-53 protein and repeating steps (b) and (c).

Other methods which may be used for identifying 
proteins in a signal transduction pathway of a cell 
may comprise for example a western blot overlay method 
which method is well known to those skilled in the 
art. Cell extracts are run on gels to separate out
protein and subsequently blotted onto a nylon
membrane. These membranes may then be incubated, for
example in a medium containing a vertebrate homologue
of UNC-53 having a label attached thereto such as a
biotin or radiolabel and any protein conjugates
visualised with for example a streptavidin or alkaline
phosphatase conjugated antibody.

The present invention also advantageously
provides a process for the preparation of binding
antibodies which recognise proteins or fragments
thereof involved in the rate and direction of cell
migration or the control of cell growth or shape, for
the above methods.

The monoclonal antibody for binding to the
appropriate vertebrate homologue of UNC-53 (or its
functional equivalent) may be prepared by known
techniques as described by Kohler G. and Milstein C.,
(1975) Nature 256, 495 to 497.

Another method which may be used to identify
proteins involved in the signal transduction pathway
of a cell of which a vertebrate homologue of an UNC-53
protein of C. elegans or a functional equivalent or
derivative or bioprecursor is a component involves
investigating protein-protein interactions using the
two-hybrid vector method. This method is well known
to those skilled in the art and was first
developed in yeast by Chien et al. (1991). This
technique is based on functional reconstitution in
vivo of a transcription factor which activates a
reporter gene. More particularly the technique
comprises providing an appropriate host cell with a
DNA construct comprising a reporter gene under the
control of a promoter regulated by a transcription
factor having a DNA binding domain and an activating
domain, expressing in the host cell a first hybrid DNA
sequence encoding a first fusion of a fragment or all
of a nucleic acid sequence according to the invention
and either said DNA binding domain or said activating
domain of the transcription factor, expressing in the
host at least one second hybrid DNA sequence, such as
a library or the like, encoding putative binding
proteins to be investigated together with the DNA
binding or activating domain of the transcription
factor which is not incorporated in the first fusion;
detecting any binding of the proteins to be
investigated with a protein according to the invention
by detecting for the presence of any reporter gene
product in the host cell; optionally isolating second
hybrid DNA sequences encoding the binding protein.

An example of such a technique utilises the GAL4
protein in yeast. GAL4 is a transcriptional activator
of galactose metabolism in yeast and has a separate
domain for binding to activators upstream of the
galactose metabolising genes as well as a protein
binding domain. Nucleotide vectors may be
constructed, one of which comprises the nucleotide
residues encoding the DNA binding domain of GAL4.
These binding domain residues may be fused to a known
protein encoding sequence, such as for example a
sequence coding for the vertebrate homologue of
UNC-53. The other vector comprises the residues
encoding the protein binding domain of GAL4. These
residues are fused to residues encoding a test
protein, preferably from the signal transduction
pathway of the vertebrate in question. Any interaction
between the vertebrate homologue of UNC-53 protein and
the protein to be tested leads to transcriptional
activation of a reporter molecule in a GAL4
transcription deficient yeast cell into which the
vectors have been transformed. Preferably, a reporter
molecule such as β-galactosidase is activated upon
restoration of transcription of the yeast galactose
metabolism genes. This method enables any
interactions between proteins involved in the signal
transduction pathway or a parallel or redundant
pathway to be investigated.
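
The logic of the two-hybrid read-out described above can be summarised in a few lines. The sketch below is a simplified Boolean model in Python, not a laboratory protocol: the function names and the prey labels are hypothetical, and the model assumes that the reporter is switched on only when one fusion carries the DNA binding domain ("BD"), the other carries the activating domain ("AD"), and the two fused proteins bind each other.

```python
def reporter_active(first_domain, second_domain, proteins_interact):
    """Return True when a functional transcription factor is reconstituted.

    first_domain / second_domain: "BD" (DNA binding domain) or
    "AD" (activating domain) carried by each hybrid.
    proteins_interact: whether the two fused proteins bind each other.
    """
    has_both_domains = {first_domain, second_domain} == {"BD", "AD"}
    return bool(proteins_interact) and has_both_domains


def screen_library(interacts_with_bait, library):
    """Return the library members that would switch the reporter on.

    The bait (for example the UNC-53 homologue) is assumed to be fused to
    the DNA binding domain and every library member to the activating domain.
    interacts_with_bait is a caller-supplied function: prey name -> bool.
    """
    return [prey for prey in library
            if reporter_active("BD", "AD", interacts_with_bait(prey))]
```

For example, `screen_library(lambda prey: prey == "PREY_2", ["PREY_1", "PREY_2"])` keeps only the placeholder prey that is declared to bind the bait.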

Any proteins identified in the signal
transduction pathway of the cell, which may be for
example a mammalian cell, may also be included in a
pharmaceutical composition together with a
pharmaceutically acceptable carrier, diluent or
excipient therefor.

The present invention also provides a process for
producing a vertebrate homologue of an UNC-53 protein
of C. elegans or a functional equivalent, fragment, or
derivative of the protein, which process comprises
culturing the cells transformed or transfected with a
cDNA expression vector having any of the cDNA
sequences according to the invention as described
above, and recovering the expressed vertebrate
homologue of UNC-53 protein. The cell may
advantageously be a bacterial, animal, insect or plant
cell.

A particularly preferred process for producing a
vertebrate homologue of UNC-53 protein or a functional
equivalent, derivative or fragment of said homologue
comprises using insect cells. Accordingly, the
invention provides a process for producing a
vertebrate homologue of UNC-53 protein of C. elegans
or a functional equivalent, fragment, derivative or
bioprecursor of the UNC-53 protein, which process
comprises culturing an insect cell transfected with a
recombinant Baculovirus vector, said vector comprising
a nucleotide vector encoding the vertebrate homologue
of UNC-53 protein or a functional equivalent, fragment
or bioprecursor thereof downstream of the Baculovirus
polyhedrin promoter and recovering the expressed
protein. Advantageously, this method produces large
amounts of protein for recovery. The insect cell may
be from for example Spodoptera frugiperda or
Drosophila melanogaster.

In accordance with the present invention, a
defined nucleic acid sequence includes not only the
identical nucleic acid but also any minor base
variations from the natural nucleic acid sequence
including in particular, substitutions in bases which
result in a synonymous codon (a different codon
specifying the same amino acid), due to the degenerate
code, or in a conservative amino acid substitution. The
term "nucleic acid sequence" also includes the
complementary sequence to any single stranded sequence
given which includes the definition above regarding
base variations.
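
The notion of a synonymous codon can be made concrete with a short example. The sketch below, in Python, uses only a small excerpt of the standard genetic code and two made-up nine-base sequences; neither sequence is taken from the sequences of the invention, and the function names are illustrative.

```python
# Excerpt of the standard genetic code (DNA codons); deliberately incomplete.
CODON_TABLE = {
    "ATG": "Met",
    "GAA": "Glu", "GAG": "Glu",
    "GAT": "Asp", "GAC": "Asp",
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
}


def translate(dna):
    """Translate a DNA coding sequence, three bases at a time."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [CODON_TABLE[dna[i:i + 3].upper()] for i in range(0, len(dna), 3)]


def is_synonymous(codon_a, codon_b):
    """True when two different codons specify the same amino acid."""
    return codon_a != codon_b and translate(codon_a) == translate(codon_b)


def encodes_same_peptide(dna_a, dna_b):
    """True when two sequences differ, if at all, only by synonymous codons."""
    return translate(dna_a) == translate(dna_b)
```

Here `ATGGAACTT` and `ATGGAGCTC` differ at two bases yet both translate to Met-Glu-Leu, whereas changing `GAA` to `GAT` replaces Glu with Asp.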

Furthermore, a defined protein, polypeptide or
amino acid sequence according to the invention,
includes not only the identical amino acid sequence
but also minor amino acid variations from the natural
amino acid sequence including conservative amino acid
replacements (a replacement by an amino acid that is
related in its side chains). Also included are amino
acid sequences which vary from the natural amino acid
but result in a polypeptide which is immunologically
identical or similar to the polypeptide encoded by the
naturally occurring sequence. Such polypeptides may
be encoded by a corresponding nucleic acid sequence.

A further aspect of the invention provides a 
nucleic acid sequence of at least 15 nucleotides of a 
nucleic acid according to the invention and preferably 
from 15 to 50 nucleotides. 

These sequences may advantageously be used as
probes or primers to initiate replication or the like.
Such nucleic acid sequences may be produced according
to techniques well known in the art, such as by
recombinant or synthetic means. They may also be used
in diagnostic kits or the like for detecting for the
presence of a nucleic acid according to the invention.
These tests generally comprise contacting the probe
with a sample under hybridising conditions and
detecting for the presence of any duplex formation
between the probe and any nucleic acid in the sample.
Nucleic acid sequences according to the invention may
also be produced using recombinant or synthetic means
such as described in Sambrook et al (Molecular
Cloning: A Laboratory Manual, 1989). Advantageously,
human allelic variants or polymorphisms of the DNA
according to the invention may be identified by, for
example, probing DNA libraries from a range of
individuals for example from different populations.
Furthermore, nucleic acids and probes according to the
invention may be used to sequence genomic DNA from
patients using techniques well known in the art, such
as the Sanger Dideoxy chain termination method, which
may advantageously ascertain any predisposition of a
patient to certain proliferative disorders.
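
The use of 15 to 50 nucleotide fragments as probes can be illustrated with a short sketch. The Python below enumerates candidate probes of a chosen length and applies a deliberately crude perfect-match model of duplex formation, in which a probe is taken to hybridise when the sample strand contains its exact reverse complement. Real hybridisation tolerates mismatches and depends on stringency, which this model ignores; the function names are illustrative and any test sequence supplied is the caller's own.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(sequence, length=15):
    """List every contiguous fragment of the given length (15 to 50 bases)."""
    if not 15 <= length <= 50:
        raise ValueError("probe length must be between 15 and 50 nucleotides")
    sequence = sequence.upper()
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def forms_duplex(probe, sample_strand):
    """Perfect-match model: the probe pairs with the sample strand only if
    the strand contains the probe's exact reverse complement."""
    return reverse_complement(probe) in sample_strand.upper()
```

A probe cut from one strand should, under this model, pair with the opposite strand of the same molecule and not with an unrelated sequence.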

A method of detecting whether a compound is an
inhibitor or an enhancer of expression of a vertebrate
homologue of UNC-53 of C. elegans, or a functional
equivalent, derivative or fragment of said vertebrate
homologue is also provided which method comprises
contacting a cell expressing said homologue with said
compound and monitoring for a phenotypic change
compared to a control cell which has not been
contacted with said compound.

Preferably the cell is a transgenic cell as
described above. Alternatively the cell may have
undergone loss of contact inhibition.

The present method also provides for determining
whether said compound is an inhibitor of expression of
said vertebrate homologue. In one embodiment the
compound to be tested comprises a nucleic acid.

Preferably said nucleic acid sequence comprises
an antisense DNA sequence or a mRNA sequence.

Preferably said mRNA sequence comprises 3'
untranslated regions of mRNA encoding for said
vertebrate homologue.

Alternatively, the compound to be tested may be a
protein. Preferably, said protein comprises a protein
having an amino acid sequence potentially suitable for
inhibiting function of said vertebrate homologue and
preferably comprises a protein identified by the
methods as described herein.

The present invention also provides a
pharmaceutical composition comprising a compound, for
example an antisense nucleic acid identified according
to the above described method together with a
pharmaceutically acceptable carrier, diluent or
excipient therefor.

A nucleic acid sequence or protein identified
according to this aspect of the invention may be used
as a medicament, or in the preparation of a
medicament, for treating loss of contact inhibition or
cancer which is mediated by a vertebrate homologue of
UNC-53 protein or a functional equivalent, fragment,
derivative or bioprecursor of said homologue.

Further provided by the invention is a nucleic
acid as defined above for use in preparation of a
medicament for inhibiting expression of a gene coding
for a vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent, derivative,
fragment or bioprecursor of said homologue.

According to a further aspect of the invention
there is provided a plasmid pCB201 deposited under
LMBP Accession No. LMBP 3594 and a MCF-7 and a NIH/3T3
cell line transfected with plasmid pCB201 deposited
under LMBP Accession Nos. LMBP 1601CB and LMBP 1603CB
respectively. Further provided by the invention is
phage lambda 3b coding for Hu-UNC-53/1 and deposited
under Accession No. LMBP 1604CB (or 3595). Also
provided are plasmids pLM1 deposited under Accession
No. LMBP 3762, pLM4 (LMBP 3763), pEGFP72 (LMBP 3764)
and pCBSOl (LMBP 3765). Further provided is a Bac
clone comprising a fragment of hu-unc-53/2 gene (LMBP
3773) and a worm strain comprising a chimeric
C. elegans human unc53 gene deposited under LMBP
Accession No. LMBP-1663CB.

Further provided by the invention is an assay for
detecting expression of a vertebrate homologue of
UNC-53 protein of C. elegans in a vertebrate cell
which assay comprises contacting a cell or an extract
thereof with an antibody to said vertebrate homologue,
or a functional equivalent, derivative or bioprecursor
thereof, which antibody is fused to a reporter
molecule, removing any unbound antibody and monitoring
for the presence of said reporter molecule.

Preferably the reporter molecule is an antibody
conjugated to for example a fluorophore such as
fluorescein or alternatively to an enzyme such as
streptavidin.

There is also provided a method for detecting for
expression of a gene coding for a vertebrate homologue
of UNC-53 protein or a functional equivalent,
derivative, fragment or bioprecursor thereof, which
method comprises contacting a probe specific for a
nucleic acid or protein sequence coding for or
corresponding to said vertebrate homologue or a
functional equivalent, fragment or bioprecursor
thereof with a cell extract, which probe is linked to
a reporter and analysing for the presence of said
reporter.

Preferably the probe is a complementary sequence
to a region of mRNA transcribed from said gene
encoding said vertebrate homologue of UNC-53 protein
or a functional equivalent, derivative or bioprecursor
thereof.

Preferably the complementary sequence is a 3' or
5' untranslated region of said mRNA. Preferably said
reporter may be a dig label, a fluorophore, a hapten
or a radiolabel.

Alternatively said probe comprises an antibody
specific for said vertebrate homologue of said UNC-53
protein or a functional equivalent, derivative,
fragment or bioprecursor thereof.

Preferably the reporter is an antibody conjugated
to for example a fluorophore such as fluorescein or
alternatively an enzyme such as streptavidin.

As described above UNC-53 protein of C. elegans
has been found to localise to microtubules and
particularly to microtubule (+) ends. Therefore,
there is provided by a further aspect of the present
invention a method of determining whether a compound
is an inhibitor or an enhancer of association of
UNC-53 or a vertebrate homologue thereof according to any
of claims 1 to 9 to microtubules or plus end
regions thereof, which method comprises (a) contacting
said compound with a transgenic cell, tissue or
organism expressing UNC-53 protein or said vertebrate
homologue and which protein is operably linked to a
reporter molecule and (b) screening for the localisation
of said reporter molecule as compared to a cell
according to step (a) which has not been contacted
with said compound.

A compound identifiable by the above method also
forms part of the present invention. Such a compound
identified as an inhibitor of localisation or
association of UNC-53 or said vertebrate homologue
with microtubules or the plus end region thereof may
be used in alleviating the spread of disease inducing
cells or metastasis or loss of contact inhibition.
Further a compound identified as an enhancer of
association of UNC-53 or said vertebrate homologue
with microtubules or the plus end region thereof may
be used in for example promoting neuronal
regeneration, revascularisation or wound healing, or
for treating chronic neurodegenerative diseases or
acute traumatic injuries or fibrotic disease. These
compounds may then be included in a pharmaceutical
composition, together with a pharmaceutically
acceptable carrier, diluent or excipient therefor.

Also provided by the present invention is a kit
for determining whether a compound is an inhibitor or
an enhancer of association of UNC-53 or a vertebrate
homologue thereof according to the invention with
microtubules or the plus end regions thereof, which
kit comprises at least one transgenic cell expressing
UNC-53 and a reporter molecule or a host or transgenic
cell according to the invention and at least one cell
of the same cell type for use as a control and means
for contacting said compound with one of said at least
one transgenic cells. Compounds identified as
inhibitors or enhancers of microtubule association
described above may advantageously be included in a
composition and linked to UNC-53 protein of C. elegans
or a vertebrate homologue thereof according to the
invention to target the compounds to the microtubules
or the plus end regions thereof. Such a composition
may also comprise, for example, a suitable
transfecting or transformation agent.

According to a further aspect of the invention
there is provided a method of targeting a protein to a
cell microtubule or the plus end region thereof, which
method comprises introducing into a host cell, tissue
or organism a transgene comprising a sequence capable
of expressing UNC-53 or a vertebrate homologue thereof
according to the invention, which sequence is operably
linked to a sequence encoding said protein to be
targeted such that a chimeric protein is expressed and
which results in targeting said protein to said
microtubule or a plus end region thereof. An even
further aspect of the invention comprises a method of
identifying a molecule which covalently modifies
UNC-53 or a vertebrate homologue thereof according to the
invention, which method comprises a) contacting either
an extract from a cell or cells expressing UNC-53 or
said vertebrate homologue or a mixture of enzymes
comprising candidate UNC-53 modifying enzymes in the
presence of an indicator of covalent modification of a
protein, b) identifying any covalently modified UNC-53
protein from step a) and c) identifying said molecule
involved in said modification step. Such an indicator
may be 32P.

Further provided by the invention is a method of
identifying a compound which alleviates or enhances
the toxicity of UNC-53 or a vertebrate homologue
thereof according to the invention, or which
alleviates or enhances apoptosis. The method of the
former comprises contacting said compound with a
transgenic cell, tissue or organism according to the
invention and monitoring for the presence of said
reporter molecule adjacent said microtubules or the
plus end regions thereof. In the case of apoptosis the
method comprises monitoring the effect of the compound
on cell death.

The invention may be more clearly understood from
the following examples which are only exemplary, with
reference to the accompanying drawings wherein:

Figure 1 illustrates the sequence of plasmid
pTB72 which codes for the full length UNC-53 protein
in C. elegans, deposited under LMBP Accession No. 3486.

Figure 2 illustrates the full-length UNC-53
protein from C. elegans.

Figure 3 is a Tblastn search of the EST division
of Genbank with the ORF of the longest known Ce-UNC-53
cDNA, tb3-M5, which reveals two ESTs with homology to a
predicted coiled-coil region in Ce-UNC-53.

Figure 4 illustrates a search of the Genbank
databases with part of the nucleotide binding domain
of Ce-UNC-53. It does not identify statistically
significant proteins except for the C. elegans cosmid
containing Ce-unc-53.

Figure 5 illustrates a three frame translation of
EST gb:R41071.
Regions of homology with Ce-UNC-53 in two
different frames are underlined. The spacing between
the blocks of homology is of similar size to that in
Ce-UNC-53. Subsequent re-cloning and re-sequencing of
this region in man identified multiple sequencing
errors in gb:R41071, and identified an ORF which is more
homologous to and co-linear with Ce-UNC-53 (see
alignment in fig. 12).

Figure 6 is a BLASTN search of the EST division
of Genbank with Hu-unc-53/1 cDNA cosmid 3b.

Figure 7 is a TBLASTN search of the Genbank
sequence database with the 961 amino acid ORF of cDNA
3b of hu-UNC-53/1: hu-UNC-53/1 forms a unique pair
with Ce-UNC-53 (cosmid F45E10) compared to the rest of
the database.

Figure 8 is a diagram illustrating the length and
overlap and tissue source of the different cDNA clones
of the 3' end of Hu-UNC-53/1 isolated in this work.

Figure 8a is a diagram illustrating the further
sequence of the Hu-UNC-53/1 and overlap of constructs
to obtain the further sequence.

Figure 8b is a diagram illustrating the 3' end
of Hu-UNC-53/1 and the EST clones present in the
database.

Figure 9a is an annotated sequence listing of
clone 3b of hu-UNC-53/1 including the EcoRI polylinker
GAATTC. The predicted Open Reading Frame of
Hu-UNC-53/1 is listed below the sequence. Blocks A, B, C, D and
E which are similar to Ce-UNC-53/1, a region which is
different between Hu-UNC-53/1 and Hu-UNC-53/2 and the
3' untranslated leader sequences are marked with
arrows and labelled.

Figure 9b is an annotated sequence listing of
Hu-UNC-53/1 available at this moment. The predicted
Open Reading Frames of Hu-UNC-53/1, pLM1, pLM3, pLM4,
pCB251, pLM5 and pCB201, the homology blocks A, B, C, D
and E, the position of a region which is different
between Hu-UNC-53/1 and Hu-UNC-53/2, the position of
phhl4-3, pCB212, pCB210-14, phh3b, phhlS, the position
of the reverse primers HU53rv1, HU53rv2, HU53rv3 and
HU53rv4, the position of peptides B72628 (=28/1),
B72627, B72626 and B72625 are listed below the
sequence.

Figure 10 is an annotated sequence listing of the
insert of clone gbAA049124 (EST479167) of mu-UNC-53/1.
The open reading frame and 3' untranslated sequence are
marked with an arrow.

Figure 11a is an annotated sequence listing of
the insert of clone gbH09036 (EST46037) of Hu-UNC-53/2.

Figure 11b is a novel DNA sequence of Hu-UNC-53/2
extended by RT-PCR. This DNA sequence is not present
in EST46037 and extends the ORF beyond position 1109
of Figure 11a to an ORF from position 18 to 1793.

Figure 11c summarises how the 3' and 5'
extensions of hu-unc-53/2 were made.

Figure 11d compiles the sequence of hu-unc-53/2.
The boxed sequences are the primer sequences used for
the respective extension steps described in the
experimental methods section.

Figure 11e illustrates the sequences of the
extensions summarised in figure 11c.

Figure 11f illustrates the sequence information
illustrating four alternative start sites observed for
hu-unc-53/2.

Figure 12 is an illustration of a Tblastn search
of the EST division of Genbank with 680aa starting at
the C-terminus of the alpha actinin domain of
hu-unc-53/2.

Figure 12a is an illustration of an amino acid
alignment of the available sequence of C. elegans
unc-53 and hu-unc-53/1 and hu-unc-53/2.

Figure 12b is an illustration of similarity plots for
Ce-unc-53 and hu-unc-53/1 (top) and for hu-unc-53/1
and hu-unc-53/2.

Figure 13 is an annotated sequence listing of
expression vector pCB201 containing homology block E
from Hu-UNC-53/1 cloned in a pcDNA3.1-HIS expression
vector. The HIS and T7-tags, PCR primer used to
modify hu-UNC-53/1 and ORF are marked.

Figure 14 is a diagram showing the alignment of
the homologous regions of hu-UNC-53/1 and mu-UNC-53/1.

Figure 15 is an annotated sequence listing of
expression vector pCDU3 containing part of Ce-UNC-53/1
cloned in expression vector pcDNA3.1. The upper ORF
starts in the vector polylinker. The lower ORF starts
at the first Methionine and is part of Ce-UNC-53/1.

Figure 16 is an annotated sequence listing of
expression vector pCDU4 containing part of Ce-UNC-53/1
cloned in expression vector pcDNA3.1. The upper ORF
starts in the vector polylinker. The lower ORF starts
at the first Methionine and is part of Ce-UNC-53/1.

Figure 17 is an annotated sequence listing of
expression vector pCDU2 containing part of Ce-UNC-53/1
cloned in expression vector pcDNA3.1. The upper ORF
starts in the vector polylinker. The lower ORF starts
at the first Methionine and is part of Ce-UNC-53/1.

Figure 18 illustrates MCF-7 cells transfected
with pCB201 (upper) compared to mock transfected MCF-7
cells (phase contrast image). The control cells are
spread out on the tissue culture plastic and
exhibit few filopodia outgrowths. The transfected
cells appear smaller because they are slightly rounded
up and have multiple filopodia outgrowths per cell
(arrowheads).

Figure 19 is a phase contrast image of MCF-7
cells, transfected with pcDNA3.1 (19a), pCDU4 (19b),
pCDU3 (19c), pCDU2 (19d) and pTB72 (19e).

Figure 20 is an F-actin pattern (visualised with
TRITC-Phalloidin) of MCF-7 cells transfected with
pcDNA3.LacZ (top panel) and with pCB201 (middle and
lower panel).

Figure 21 is an F-actin pattern
(visualised with TRITC-Phalloidin) of MCF-7 cells
transfected with pcDNA3.1 (21a), pCDU4 (21b),
pCDU3 (21c), pCDU2 (21d) and pTB72 (21e).

Figure 22 is a phase contrast image of N4
neuroblastoma cells transfected with pcDNA3.1 (22a),
pCDU4 (22b), pCDU3 (22c), pCDU2 (22d) and pTB72 (22e).

Figure 23 is an F-actin pattern
(visualised with TRITC-Phalloidin) of N4 neuroblastoma
cells transfected with pcDNA3.1 (23a), pCDU4 (23b),
pCDU3 (23c), pCDU2 (23d) and pTB72 (23e).

Figure 24 illustrates phase contrast images of
small (top), medium (middle) and large foci (bottom)
induced in a monolayer of NIH3T3 cells by transfection
with pCB201.

Figure 25(c) illustrates human metaphase
chromosomes probed with a probe lp34 and figures 25a
and 25b indicating the chromosomal location of
hu-UNC-53/1 in 1q31. Essentially the same techniques were
used to assign the gene hu-unc-53/2 to chromosome
locus 11p15 (25d and e) as illustrated in micrograph
25f.

The ideograms 25a and 25d are from the
International System for Human Cytogenetic Nomenclature
1985. The ideograms 25b and 25e, in which the relative
band positions and arm ratios were derived from actual
chromosome measurements, are from Cytogenet Cell Genet
65:206-219 (1994).

Figure 26 is an expression pattern of HU-Unc53/1
and HU-Unc53/2 in normal human tissues and cancer cell
lines.

Figure 27 is a sequence map of Plasmid pNP3.

Figure 28 is an exemplary list of prosite
signatures which can be used to define and identify
vertebrate homologues of UNC-53.

Figure 29 is an annotated sequence map of plasmid
pEGFPsac. The GFP-C. elegans unc53sac fusion protein,
and the C. elegans unc53 sac fragment are indicated.

Figure 30 is a sequence map of plasmid pEGFP72.
The GFP-C. elegans unc53 fusion protein and the
C. elegans unc53 fragment are indicated.

Figure 31 is an annotated sequence map of plasmid
pEGFPsma. The GFP-C. elegans unc53sma fusion protein,
and the C. elegans unc53 sma fragment are indicated.

Figure 32 is an annotated sequence map of plasmid
pEGFPecl. The GFP-C. elegans unc53ecl fusion protein,
and the C. elegans unc53 ecl fragment are indicated.

Figure 33 is an annotated sequence map of plasmid
pEGFPxba. The GFP-C. elegans unc53xba fusion protein,
and the C. elegans unc53 xba fragment are indicated.

Figure 34 is an annotated sequence map of plasmid
pLM4. Open reading frames of the hul-unc53/1 and GFP
are indicated.

Figure 35 is a sequence map of plasmid pNP8.

Figure 36 is an illustration of microtubule
association of C. elegans Unc53, shown in HepG2 cells
transiently transfected with pTB72, expressing
C. elegans Unc53. Panel A: microtubule staining of
HepG2 cells, using YL1/2. Panel B: C. elegans Unc53
staining, using rab4.

Figure 37 is an illustration of microtubule plus-end
association in human cell lines transiently
transfected with pTB72, expressing C. elegans Unc53.
C. elegans Unc53 was stained with mab-16-48. Panel C:
COS cells showing microtubule association. Panel B:
MCF7 cells showing microtubule plus-end association.
Panel A: HepG2 cells showing microtubule plus-end
association.

Figure 38 is an illustration of microtubule
association in N4 cells transiently transfected with
pEGFP72, expressing the GFP-C. elegans Unc53 fusion
protein. GFP fluorescence was observed in living
cells. Panel A: microtubule association of the
GFP-C. elegans unc53 fusion protein. Panel B: microtubule
plus-end association of the GFP-C. elegans unc53 fusion
protein.

Figure 39 is an illustration of microtubule
association in N4 cells transiently transfected with
pEGFP72, expressing the GFP-C. elegans Unc-53 fusion
protein. Microtubules were stained with YL1/2 after
paraformaldehyde fixation. Panel A: Microtubule
association of the GFP-C. elegans unc53 fusion protein.
Panel B: tubulin staining. Panel C: panel A plus
panel B: co-localisation of the GFP-C. elegans unc-53
fusion protein and tubulin can be seen as yellow.

Figure 40 is an illustration of microtubule
association in N4 cells, transiently transfected with
pEGFPsma, expressing the GFP-C. elegans unc53sma fusion
protein. Panel A: Microtubule association of the
GFP-C. elegans unc53sma fusion product. Panel B: Centriole
association of GFP-C. elegans unc53sma fusion product
when expressed at low levels.

Figure 41 is an illustration of microtubule
association in N4 cells, transiently transfected with
pEGFPecl, expressing the GFP-C. elegans unc53ecl fusion
protein. Panel A: Microtubule association of the
GFP-C. elegans unc53ecl fusion product. Panel B: Centriole
association of GFP-C. elegans unc53ecl fusion product
when expressed at low levels.

Figure 42(a)/Figure 42(b) are illustrations of
fluorescence of GFP in N4 cells transiently
transfected with pEGFPxba and pEGFPsac respectively.

Figure 43 is an illustration of microtubule
association in N4 cells transiently transfected
with pLM4 expressing GFP-Hu-UNC53/1 fusion protein.
Panel A: microtubule association of GFP-Hu-UNC53/1
fusion protein. Panel B: microtubule plus-end
association of GFP-Hu-UNC53/1 fusion protein. Panel
C: microtubule association of GFP-Hu-UNC53/1 in
dividing cells (end of division).

Figure 44 is an illustration of the sequence of
Plasmid pNP9.

Figure 45 is an illustration of immunofluorescence
in melanoma G361 cells stained with sera
28.1. Panel A: Microtubule plus-end association of
Hu-UNC53/1. Panel B: microtubule plus-end association
of hul-Unc53 in growth cone extensions.

Figure 46 is an illustration of GFP fluorescence
and immunofluorescence in N4 cells transiently
transfected with pLM4, and stained with sera 28.1.
Panel A: Fluorescence of GFP-Hu-UNC53/1 fusion
protein. Panel B: Immunofluorescence of serum 28.1.

Figure 47 is an overview of the microtubule (+)
end, the microtubule and F-actin cytoskeleton binding
properties of different constructs.

Figure 50 is an illustration of rescue of lateral
ALN neurons in mutant unc-53.

Dorsal view of the ALN neuron axons visualised
by GFP fluorescence with the transgene pA/GFP in the
posterior of an adult; (c) cellular body.

a) wild type: the anterior axon (aa) migrates in a
straight line along the body until reaching the head,
on the dorsal sublateral cord; the posterior axon (ap)
migrates into the tail;

b) unc-53 (n152): anterior axons are shorter, stop
ahead of the vulva region and form numerous collateral
branches towards the dorsal cord;

c) unc-53 (n152), pA/unc-53: anterior axons no longer
form branches and migrate in a straight line into the
head, as in the wild type at a).

Scale bar 10 µm.

Figure 51a is an illustration of the chimeric
fusion between the C. elegans gene and human homologue 1
of the unc-53 gene. The region of the putative nucleotide
binding domain (NTP) is replaced in the C. elegans
cDNA by the same region of the human homologue 1 of
unc-53 (H1). The cDNA is under the control of promoter
region A (pA) of unc-53, which drives expression in the
ALN lateral neurons.

Figure 51b is an illustration showing that the chimeric
nematode/human minigene pA/unc-53-H1 partially rescues
the defect in the longitudinal migration of the
lateral neurons ALN and PLN. The four strains
compared are: wt; unc-53(n152); unc-53(n152),
pA/unc-53; and unc-53(n152), pA/unc-53-H1. The
observed phenotypes are put in three classes:
"wild type", the axon is straight, unbranched, and
migrates as far as the head; "vulva", the axon is
straight, unbranched, and stops in the vulva region;
"mutant", the axon is short, never reaches the vulva
region and forms many collateral branches. Numbers
are percentages. The number of observed axons is
noted in the last column. The chimeric fusion between
the C. elegans gene and the human homologue (unc-53-H1)
partially rescues the mutant phenotype. The chimeric
gene was made by replacing the putative nucleotide
binding region (NTP) of the nematode cDNA with the same
region of the human homologue 1 (H1).

Figure 52 is an illustration of the sequence for 
plasmid pLM5. 

Figure 53 is an illustration of the sequence for
plasmid pLM6.

Figure 54 is an illustration of the sequence for
plasmid pLM1.

DEPOSITED MATERIALS

Deposit                                Date of deposit     Accession No.

pCB201 plasmid DNA in E. coli          3 December 1996     LMBP 3594
Lambda clone 3B encoding hu-unc-53/1   3 December 1996     LMBP 3595
MCF-7 clone z4 (mock)                  3 December 1996     LMBP 1600CB
MCF-7 clone (pCB201)                   3 December 1996     LMBP 1601CB
NIH-3T3 mock                           3 December 1996     LMBP 1602CB
NIH-3T3 pCB201                         3 December 1996     LMBP 1603CB
pLM1                                   13 November 1997    LMBP 3762
pLM4                                   13 November 1997    LMBP 3763
PEGFP72                                13 November 1997    LMBP 3764
PCB501                                 13 November 1997    LMBP 3765
BAC clone comprising fragment of       15 November 1997    LMBP 3773
hu-unc53/2 gene
Worm strain with chimeric              15 November 1997    LMBP-1663CB
C. elegans/human unc-53 gene


The above plasmids and cell lines were deposited
at the Belgian Coordinated Collections of
Microorganisms (BCCM) at the Laboratorium voor Moleculaire
Biologie - Plasmidencollectie (LMBP), B-9000 Gent,
Belgium, in accordance with the provisions of the
Budapest Treaty of 28 April 1977.

The present invention will now be described with 
reference to the following examples which are not 
limiting. 

Identification of a human homologue of the UNC-53
protein of C. elegans

Extensive searches with the Ce-UNC-53 sequence
(Figures 1 and 2) against the public domain databases
(EST, Genbank, EMBL, Swissprot and PIR) revealed no
statistically significant homologies (a smallest sum
probability (ssp) of 10e-8 is generally accepted to
be significant at amino acid level). Two ESTs,
gbH09036 (ssp = 1.1e-5), a Homo sapiens cDNA clone,
and gbAA049124 (ssp = 8.6e-5), a mouse cDNA clone, showed
homology to a "coiled coil" region, a common motif
contributing to protein secondary structure
(Figure 3).

All other candidate scores were at background
level (ssp > 0.21). Careful examination of weak
candidate ESTs identified EST gb:R41071 from Homo
sapiens, which had obtained a low score of 53 and a
non-significant probability score of 0.33 (Fig. 4).
The inventors surprisingly discovered potentially
significant homology with the Ce-UNC-53 nucleotide
binding domain, provided multiple frameshifts and
sequence errors were hypothesized.
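
The significance reasoning in the two preceding paragraphs can be sketched as a simple classification of database hits by their smallest sum probability. This is only an illustration, not the search software used by the inventors: the function name and the three category labels are hypothetical, the significance cutoff is one assumed reading (1e-8) of the threshold quoted in the text, and the value for gbAA049124 is an assumed reading (8.6e-5) of a garbled figure.

```python
# Assumed reading of the significance cutoff quoted in the text.
SIGNIFICANCE_CUTOFF = 1e-8
# Background level quoted in the text for all other candidates.
BACKGROUND_LEVEL = 0.21


def classify_hit(ssp):
    """Label a database hit by its smallest sum probability (ssp)."""
    if ssp <= SIGNIFICANCE_CUTOFF:
        return "significant"
    if ssp > BACKGROUND_LEVEL:
        return "background"
    return "weak"


# ssp values as reported in the text (8.6e-5 is an assumed reading).
reported_hits = {
    "gbH09036": 1.1e-5,
    "gbAA049124": 8.6e-5,
    "gb:R41071": 0.33,
}

for name, ssp in reported_hits.items():
    print(name, ssp, classify_hit(ssp))
```

Under these assumptions none of the three ESTs reaches the significance cutoff, which is consistent with the statement that the homology of gb:R41071 only became apparent once frameshifts and sequence errors were hypothesized.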

The inventors amplified, cloned and sequenced
part of gb:R41071 from human heart and human lung cDNA
and from human genomic DNA and discovered that clone
gb:R41071 had up to ten different mistakes in the
region checked. Five extra nucleotides were scattered
along its sequence, two nucleotide substitutions
were identified, and gb:R41071 lacked three
nucleotides present in our clone (Fig. 5). The novel
sequence obtained was two nucleotides shorter and
showed the two UNC-53-homologous regions in frame.
The genomic fragment obtained is larger (700 bp total
length) than the corresponding cDNA clones, indicating
the presence of an intervening sequence of around 500
bp at nucleotide 162 of this fragment. The amplified
cDNA fragment was cloned into vector pCRII
(Invitrogen), named pCR231, and used as a probe
to screen cDNA libraries.
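
The error counts in the paragraph above are internally consistent, which a short arithmetic sketch makes explicit. The function names below are hypothetical and the code only restates the counts given in the text (five extra nucleotides, two substitutions, three missing nucleotides, a 700 bp genomic fragment with an intervening sequence of around 500 bp); it is not a reconstruction of the inventors' analysis.

```python
def reconcile_est(extra_in_est, substitutions, missing_from_est):
    """Summarise the differences between an EST and the resequenced clone."""
    total_errors = extra_in_est + substitutions + missing_from_est
    # Length of the corrected sequence relative to the EST:
    # extra nucleotides are removed, missing nucleotides are added back.
    net_length_change = missing_from_est - extra_in_est
    # Substitutions never shift the frame; only the net indel count does.
    frame_unchanged = net_length_change % 3 == 0
    return {
        "total_errors": total_errors,
        "net_length_change": net_length_change,
        "frame_unchanged": frame_unchanged,
    }


def estimated_exonic_length(genomic_length, intron_length):
    """Approximate length of cDNA-matching sequence in a genomic fragment."""
    return genomic_length - intron_length


summary = reconcile_est(extra_in_est=5, substitutions=2, missing_from_est=3)
print(summary)
print(estimated_exonic_length(700, 500))
```

The totals agree with the text: ten differences in all, and a corrected sequence two nucleotides shorter than the EST. Because the net change is not a multiple of three, the EST and the corrected sequence cannot keep the same reading frame throughout, which fits the observation that the two homologous regions only appear in frame after correction.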

The conceptual translation of the clones we
obtained by PCR was screened using BLAST and TBLASTN
against all known protein and DNA sequences in the
database. The only clone which came up with
statistically significant similarity was Ce-UNC-53
(Fig. 6). This human clone and Ce-UNC-53 thus form a
unique homologous pair compared to the rest of the
known sequences, indicating the statistical relevance
and novelty of our discovery. We designate this human
gene as hu-UNC-53/1. A human heart and a human
colorectal adenocarcinoma cDNA library were probed
with the pCR231 probe to identify longer cDNA clones. The
clones overlap, giving a linear sequence of 3706 bp
(Figs. 8 and 26). This sequence shows a 959 amino acid
open reading frame from the beginning of the clone.
The absence of a 5' untranslated region suggests that
the mRNA will extend further 5'.

Sequence alignment searches of the public domain
databases with the DNA sequence of hu-UNC-53/1 and
its conceptual translation identified a series of
ESTs, most of which correspond to the 5' UTR region
(Figures 7 and 8). Surprisingly, hu-UNC-53/1
also identified the cDNA clones gbH09036 and
gbAA049124, homologous to the predicted coiled coil
region in Ce-UNC-53 and hu-UNC-53/1, and furthermore
identified a third weakly homologous EST, gbR21023.
The inserts of gbH09036, gbAA049124 and gbR21023 were
obtained from the Merck consortium and sequenced.

gbAA049124 is >95% identical to Hu-UNC-53/1 over
604 available amino acids (Fig. 10) and is the mouse
orthologue of Hu-UNC-53/1. The insert in gbH09036 is
clearly homologous to hu-UNC-53/1 but derived from a
different locus. We therefore name the gene
identified by gbAA049124 Mu-UNC-53/1 and the gene
identified by gbH09036 Hu-UNC-53/2 (Figure 11).

5 domains of high similarity mark the unc-53 gene
family

Ce-UNC-53 and the vertebrate homologues identified
here form a unique novel protein family that is
distant from the remainder of the proteins in the
public domain. Alignment of the predicted open
reading frames shows that Hu-UNC-53/1 and Hu-UNC-53/2
are equidistant from Ce-UNC-53. The highest homology
is found in the carboxyterminal region of Ce-UNC-53.
The presence of a conserved GXXGKS/T box
suggests a nucleotide binding function. However, this
domain as a whole does not belong to a class of known
nucleotide binding proteins.
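
As an illustration of what the GXXGKS/T box means in practice, the sketch below scans a protein sequence for that pattern with a regular expression. The pattern is taken literally from the text (G, any two residues, G, K, then S or T); the function name and the example sequence are hypothetical and are not taken from any UNC-53 family member.

```python
import re

# GXXGKS/T written as a regular expression. The lookahead makes the
# search report overlapping occurrences as well.
NUCLEOTIDE_BOX = re.compile(r"(?=(G..GK[ST]))")


def find_nucleotide_boxes(protein):
    """Return (0-based position, matched residues) for each GXXGKS/T box."""
    sequence = protein.upper()
    return [(m.start(), m.group(1)) for m in NUCLEOTIDE_BOX.finditer(sequence)]


# Hypothetical sequence used only to exercise the function.
example = "MKTAGSPGKSTLLRA"
print(find_nucleotide_boxes(example))
```

A match of this kind only flags a candidate site. As the text notes, the presence of the box suggests a nucleotide binding function without placing the domain in a known class of nucleotide binding proteins.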

The similarity amongst the presently known
sequences of the UNC-53 family of proteins is highest
in 5 blocks over most of the available sequence (959
amino acids) and a further block identified in Figure
12a. These blocks can be assigned signature sequences
as displayed in Figure 28 or can be assigned weight
matrices based on the alignment between the different
family members. By using truncated constructs of
Ce-unc-53, the functional relevance of these domains has
been addressed.

Hu-UNC-53/1 and Hu-UNC-53/2 are complex
transcription units

1. Cancer cell line RNA blots probed with
Hu-UNC-53/1.

A Northern blot of poly-A+ RNA from several
cancer cell lines (Melanoma G361, Lung Cancer A549,
Colorectal Adenocarcinoma SW480, Burkitt Lymphoma
DRajii, Leukemia Molt4, Lymphoblastic Leukemia K562,
HeLa S3 and Promyelocytic Leukemia HL60) was probed
using the whole insert of pHH3b. No or weak
expression was detected in the Burkitt Lymphoma
DRajii, the Leukemia Molt4 and the Promyelocytic
Leukemia HL60 cell lines. Five different transcripts
are detected in the remaining cancer cell lines:
transcripts 1 and 2 are larger than 9.5 kb, transcripts
3 and 4 are 6 to 7 kb and the fifth transcript is
around 6 kb. Transcripts 1 and 2 are present in all
expressing cell lines. Transcripts 3 and 4 are
restricted to Melanoma G361, Lung Cancer A549 and
Colorectal Adenocarcinoma SW480 and are the
predominant transcripts in Melanoma G361 and
Colorectal Adenocarcinoma SW480. Transcript 5 is
restricted to Lymphoblastic Leukemia K562 and HeLa S3
and is predominant in HeLa S3.



2. Cancer cell line RNA blots probed with
Hu-UNC-53/2.

A similar set of cancer cell line Northern
blots was probed with a 652 bp fragment of EST46037
amplified by using the primers
5'-aggagatgaagctgacagatatcc and 5'-aaacaccagtgagtcc.
Hu-UNC-53/2 is expressed in Melanoma G361, Colorectal
Adenocarcinoma SW480, Lymphoblastic Leukemia K562 and
HeLa S3. No expression was detected in Lung Cancer
A549, Burkitt Lymphoma DRajii, Leukemia Molt4 and
Promyelocytic Leukemia HL60. Interestingly, only two
transcript sizes were detected: a transcript of around
7 kb expressed in Lymphoblastic Leukemia K562 and HeLa S3,
and a transcript of >9.5 kb in Melanoma G361 and
Colorectal Adenocarcinoma SW480.
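
For readers who want to sanity-check the two primers quoted above, the following sketch computes their length, GC content and a rough melting temperature. The function name is hypothetical, and the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a common rule of thumb for short oligonucleotides that was chosen for illustration; the text does not say how the primers were designed or which annealing conditions were used.

```python
def primer_stats(primer):
    """Basic composition statistics for a DNA primer written 5' to 3'."""
    seq = primer.upper()
    gc_count = seq.count("G") + seq.count("C")
    at_count = seq.count("A") + seq.count("T")
    return {
        "length": len(seq),
        "gc_count": gc_count,
        "gc_fraction": gc_count / len(seq),
        # Wallace rule of thumb, only a rough estimate.
        "wallace_tm": 2 * at_count + 4 * gc_count,
    }


# Primer sequences exactly as printed in the text.
forward = primer_stats("aggagatgaagctgacagatatcc")
reverse = primer_stats("aaacaccagtgagtcc")
print(forward)
print(reverse)
```

The two primers differ noticeably in length (24 and 16 nucleotides as printed), so the estimated melting temperatures differ as well; whether that reflects the original design or the printed text cannot be determined from the description.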



3. Normal human tissue probed with Hu-UNC-53/1.

A Northern blot of poly-A+ RNA from normal
human tissue was probed using the whole insert of
phage HH3b. Expression levels are low in all tissues,
with the highest level in heart and placenta, several
fold lower levels in brain and testis, and even lower
levels in skeletal muscle, pancreas, thymus, colon,
small intestine, ovary and prostate. Expression in
peripheral blood leukocyte, lung, liver, kidney and
spleen is barely detectable.

4. Normal human tissue probed with Hu-UNC-53/2.

A similar set of blots was probed with a
652 bp fragment of EST46037 amplified by using the
primers 5'-aggagatgaagctgacagatatcc and
5'-aaacaccagtgagtcc. Expression levels are low in all
tissues, with the highest level in kidney and lower levels
in heart, placenta, lung, skeletal muscle and
pancreas. Expression is barely detectable in brain
and liver.

The hu-UNC-53/1 and hu-UNC-53/2 homologues are
clearly highly regulated genes, showing a strong
tissue specificity and, probably, additional
mechanisms of regulation (i.e. differential splicing or
different promoters). The different proteins derived
from RNAs identified by probe hh15 presumably share
the carboxyterminal nucleotide binding domain.
Ce-UNC-53 was shown to be a complex genetic locus and
complex transcription unit. The different transcripts
are thought to be a mechanism to assure the necessary
specificity and functional diversity of this signal
transduction pathway, with respect to different
signals and receptors, different tissues and different
directions of migration. The occurrence of a new
transcript or the observed changes in expression
levels in the cancer cell line blot suggests a role
for hu-UNC-53/1 and hu-UNC-53/2 in the establishment
or maintenance of the transformed state of those
cells.

WO 98/24810 PCT/EP97/06956

Phenotypic changes in cells transfected with the 
Nucleotide Binding Domain of Ce-UNC-53/l and Hu-UNC- 



Ectopic expression of full length Ce-UNC-53 in C.
elegans, murine neuroblastoma cells or human MCF-7
breast-carcinoma cells has been found to lead to
increased filopodia outgrowth and increased motility
(unpublished). The structure of Ce-UNC-53 protein is
reminiscent of that of large kinases or dynamin, where
a catalytic domain is positively or negatively
regulated by domains that interface with signal
transduction pathways (for example by GRB2 binding,
phosphorylation or the like). The inventors therefore
decided to test whether the nucleotide binding domain by
itself is capable of inducing the observed changes in
the microfilament cytoskeleton and motile or ruffling
behaviour.

cDNA fragments coding for the nucleotide binding
domains of Ce-UNC-53 and Hu-UNC-53/1 were cloned in
mammalian expression vectors with the CMV promoter
(see experimental procedures).

To be able to detect expression from pCB201 (Fig.
13), an N-terminal His tag and a T7 epitope tag were fused
in frame with the hu-UNC-53/1 cDNA hh15. pCDU3
contains a larger fragment of Ce-UNC-53 and starts
just before the conserved "VIELKIEL" domain (Fig. 12).

The empty pcDNA3 vector or pcDNA3.1-His-LacZ, a
mammalian expression vector for E. coli
beta-galactosidase, was used as a control vector (mock
transfection). The differences between mock and
transfected N4 and MCF-7 clones were analysed using
phase-contrast and Nomarski microscopy coupled with
time lapse analysis, phagokinesis and
immunocytochemical characterisation of the F-actin.

Phenotypic changes in mouse N4 neuroblastoma cells



N4 neuroblastoma cells were stably transfected
with control construct pcDNA3.1 and the C. elegans
UNC-53 constructs pTB72, pCDU2, pCDU3 and pCDU4. The
population of clones transfected with the empty
expression vector was homogeneous and similar to wild
type N4 cells. In contrast, 25% to 50% of the
clones transfected with pTB72, pCDU2, pCDU3 and pCDU4
(see experimental procedures and Figs. 1, 17, 15 and 16
respectively) had distinct phenotypes:

1. Wild type N4 cells, or N4 cells transfected with
pcDNA3 (designated as mock transfection), show a central
cell body with extensions, designated as neurite
outgrowths. Less than 5% of the population have
lamellae. When present, they are generally situated
on the cell body and on the opposite side from the
neurite extensions (figure 22a). The lamellae show a
radial actin spike pattern. Limited branching of the
actin fibres is observed in wild type or pcDNA3
transfected N4 cells. Side branches are smaller and
can be clearly distinguished from the main actin
branch (figure 23a).

2. N4 cells stably transfected with pCDU4,
harbouring the homology block E, show an overall
morphology which is similar to that of wild type N4 cells
(a cell body with neurite outgrowth). They exhibit,
however, an increased frequency and level of lamellae
formation (figure 22b). These lamellae, which contain
F-actin microspikes, are found on both the cell body
and the neurite outgrowth (figure 23b). Wild type N4
cells, in contrast, rarely exhibit lamellae on
the neurite outgrowths.
3. N4 cells stably transfected with pCDU3,
encoding homology blocks C, D and E, show an even
higher level of lamellae formation. Labelled with
TRITC-phalloidin, the cells appear surrounded with
F-actin fibres, consisting of bundles of F-actin
microspikes (figure 23c). The presence of these
lamellae has completely modified the general
appearance of the cells. They appear flatter and, in
90% of the population, it is not possible to
distinguish between the cell body and the wide neurite
as they flow gradually into one another (figure 22c).
If wild-type-like thin neurite-like outgrowths are
present, they are frequently numerous, branched and
located all around the cell.

4. The overall morphology of N4 cells stably
transfected with pCDU2, encoding homology blocks
A, B, C, D and E, resembles that of the wild type
cells, since cell body and neurite outgrowth can be
clearly distinguished. The pCDU2 transfected cells,
however, show more neurite outgrowths, and these are
long and very branched, especially at the end of the
outgrowth. When neurite outgrowths of different cells
make contact, increased branching can be observed,
giving the appearance of a network (figure 22d). N4
cells transfected with pCDU2 show bundles of long
radial F-actin filaments (microspikes), which can be
branched, especially apically. The space between the
hand-shaped actin spikes is mostly filled in with
actin, leading to small lamellae-like structures.
The network-like branching between the cells also
shows both the bundled actin structures and the
lamellae-like fill-in features. These dense F-actin
structures are sometimes seen on the cell body, which
enhances the network-like appearance of the cells
(figure 23d).

5. N4 cells stably transfected with plasmid
pTB72, encoding the full length C. elegans UNC-53
protein, seem to have a more rigid structure than wild
type cells, most clearly seen as spindle-like and
triangle-like cells. The corners of these cells show
an increased level of hand-like lamellae structures.
This specific phenotype is best seen when the cells
are grown at low density (figures 22e and 23e).

Phenotypic changes in human breast carcinoma MCF-7
cells

MCF-7 cells were stably transfected with
pTB72, pCDU2, pCDU3, pCDU4 and pCB201. The population
of clones transfected with the LacZ-expression vector
was homogeneous and similar to wild type MCF-7 cells.
In contrast, approximately 30-50% of the clones transfected
with pTB72, pCDU2, pCDU3, pCDU4 and pCB201 had
distinct phenotypes, which were analysed as above for
the N4 cells:

1. Wild type and mock (pcDNA3) transfected MCF-7
cells are heteromorphic. In general they are round
cells or clusters of cells surrounded by lamellae.
Bulges, similar to thick filopodia, can be observed
(figure 19a). When the cells are stained with FITC-
or TRITC-coupled phalloidin, F-actin stress
fibres can be observed, often in rings surrounding the
cell body (figures 20a and 21a). When cells are rounded up
like this, actin is present at the edge of the cell
body. Less than 10% of the cells display filopodia
filled with radial F-actin microspikes. In time-lapse
analysis the cells are highly quiescent, with limited
ruffling at the edge of the cell.

2. MCF-7 cells transfected with pCDU4, encoding
homology block E, show two major phenotypic
differences compared to the wild type cells. These
cells are flatter and have more extended
lamellipodia, leading to a pancake-like appearance.
Some clones show more filopodia than wild type (figure
19b). Radially organised F-actin fibres can clearly
be observed in the lamellae surrounding the cells.
These stress fibres resemble the wild-type structures,
but have a more radial than circular orientation. In
the filopodia, one can observe an increase of
apparently unorganised bundles of actin patches
(figure 21b).

3. MCF-7 cells stably transfected with pCDU3,
encoding the homology blocks C, D and E, show a
strikingly different and constant morphology. The
cells appear smaller than wild type because they are
more rounded up. All the cells have more filopodia,
surrounding the cell body (figure 19c).
Morphologically these filopodia have the same
"hand-like" appearance as those observed in N4 neuroblastoma
cells. Such filopodia are hardly ever observed in
mock transfected MCF-7 cells. These filopodia are
filled with F-actin fibres. Compared to wild type
cells, fine actin stress fibres are decreased (figure
21c). In time-lapse analysis single cells as well as
clusters of cells can be seen to ruffle much more
dynamically than single cells or clusters of wild type
cells. The "half-life" of a filopodial outgrowth on
the cell surface is much shorter in transfected cells
and the number of filopodia present at any time is
higher.

4. Cells transfected with pCB201 (which is
structurally similar to pCDU4 but human) have a
phenotype that is nearly indistinguishable from that
of cells transfected with pCDU3, except that the
ruffling activity and filopodia
outgrowth are even higher than with pCDU3 (figure 18).

5. The overall morphology of the MCF-7 cells
transfected with pCDU2, which encodes the homology
blocks A, B, C, D and E, resembles that of the pCDU3
transfected cells. The cells are more rounded up and
show more filopodia than the wild type and mock
transfected cells (figure 19d). The filopodia, which
are all around the cell body, tend to be longer and
show a difference in actin organisation. The small
filopodia have the same actin bundles as seen in the
pCDU3 transfected cells. In the longer filopodia, the
actin bundles are more parallel, and radial to the
cell body (figure 21d).

6. MCF-7 cells stably transfected with pTB72,
encoding the full length UNC-53 protein, are extremely
rounded up, and tend to adhere more than wild type
cells. The cells grow in clusters with sausage- or
tube-like shapes. The presence of large, extremely
thin lamellae with a surface area of more than three
times that of the central cell body forms a second
morphological feature, unique to the pTB72
transfected MCF-7 cells (figure 19e). These sheets
are difficult to observe under a phase contrast
microscope, but are very clear when stained with
phalloidin. The lamellae protrude from one side of a
cell or group of cells and are filled with thin long
criss-crossing actin fibres, different from "giant"
wild type MCF-7 cells (figure 21e).

These experiments lead to the following set of
conclusions (Figure 47 summarises the data of the
domain swapping experiments in C. elegans unc-53):

1. Murine and human cells transfected with the
Ce-UNC-53 or hu-UNC-53/1 domains show clear effects on
the nature and dynamics of their motile behaviour, as
demonstrated by changes in the F-actin cytoskeleton
(the increase in lamellipodia, hand-like filopodia and
"hair-like" microspikes on the cell surface and the
associated reduction of the "rings of F-actin" stress
fibres).

2. This effect is found in two cell types of
different species and tissue origin: MCF-7 cells
(human breast carcinoma cells of epithelial origin)
and murine N4 neuroblastoma cells. pCB201, pCDU3 and
pCDU4 induce in MCF-7 cells a type of filopodium which
is frequent in wild type N4 cells but rare to absent
in wild type MCF-7 cells, suggesting the activation by
these constructs of motile behaviour which is "normal"
in N4 cells but of an unusual type in MCF-7 cells.
This indicates the activation of a specific downstream
process as opposed to a disruption of an existing
process. It is well known that some cell types prefer
to migrate with filopodia and other cell types with
lamellipodia.

3. Expression of pCB201, pCDU3 and pCDU4 gives
qualitatively similar F-actin remodelling and
increased filopodia and lamellipodia outgrowth.
pCB201 and pCDU3 are, however, much more active in this
process than pCDU4.

4. pCB201 is a much more potent activator of
filopodia outgrowth than pCDU4, which is to be
expected considering the large evolutionary distance
between C. elegans and vertebrates.

5. These experiments identify homology domain E
(predicted nucleotide binding domain) of UNC-53 as the
"domain" that activates F-actin remodelling and
filopodia/lamellipodia outgrowth. Progressive
addition of the aminoterminal homology domains A, B, C
and D leads to qualitative and quantitative modulation
of the phenotype present in domain E.

6. Homology domains C and D (pCDU3) enhance the
basic activity present in homology domain E
(pCDU4/pCB201).

7. Homology domains B and C (pCDU2)
qualitatively modify the phenotype of domain E,
leading to morphologically different lamellipodia
formation than in pCDU3 transfected cells. It is thought
that lamellipodia and filopodia formation are mediated
by different signal transduction pathways requiring
two related but different Ras-like G-proteins: RAC for
lamellipodia formation and CDC42 for filopodia
formation.

8. pTB72, which includes homology domains
A, B, C, D and E plus an additional 700 amino acids not yet
identified or isolated in the human members of the family,
confers a more localised filopodia outgrowth and a
different morphology.

9. The expression levels of pTB72 (full length
C. elegans UNC-53), pCDU3, pCDU4 and pCB201 are
extremely low. The observed effect is therefore
unlikely to be due to dominant negative effects (such
as stoichiometric depletion of other cellular
components) or structural changes in the actin
cytoskeleton mediated by UNC-53 or its fragments.

The data point to a multi-domain organisation in
UNC-53 whereby the aminoterminal domains exert
positive (e.g. pCDU3) and negative (e.g. pCDU2)
control on the activity of domain E, or lead
to novel activities or to the localisation of the activity
in the cell (pCDU2, pTB72). Our observation that the
nucleotide binding domains (NTB) of distantly related
members of the UNC-53 family induce similar
phenotypes suggests a general role for this domain of
the UNC-53 family.



CELLULAR ASSAYS TO IDENTIFY PHARMACOLOGICAL
MODULATORS OF UNC-53 AND COMPONENTS OF THE UNC-53
PATHWAY

Mammalian and human cells transfected with
plasmid constructs containing unc-53 sequence of
either C. elegans or human origin were observed to
display obvious, specific and similar changes in
comparison to mock or untransfected parent cells.
These changes relate to the functioning of the
cytoskeleton, in particular the F-actin cytoskeleton,
to cell locomotion and to directional cell motility, and
indicate that UNC-53 gene family members are capable of
playing an integrator function in cell motility.

The cellular tools derived through transfection,
and the functional assays derived with these cells, not
only enable characterisation of the motile phenotype
typically observed after introduction of unc-53 genes;
they can also be easily adapted to screen for
pharmacological compounds that interfere with either
(1) the expression of unc-53 gene family members or (2)
the cellular functioning of unc-53 transgene(s) and of
components in the unc-53 signal transduction pathway.

Two classes of pharmacological modulators are 
envisaged. 

A first class are inhibitors of UNC-53s or the
unc-53 pathway(s), which revert the described
phenotypic changes induced by unc-53 transgenes or
aspects thereof. Such compounds are considered
relevant leads to target diseases where unwanted
directional motility of cells occurs, such as
metastasis, angiogenesis or inflammation.

Secondly, pharmacological stimulators are
envisaged, such as compounds which induce, in
non-transfected cells, phenotypes that reproduce or mimic
(aspects of) the described "unc-53" phenotype. Such
compounds may do so by inducing or upregulating
expression levels of a known unc-53 gene or by
activating endogenous (yet unidentified) members of
the unc-53 gene family. The target applications here
are wound and tissue repair, in particular
neuronal regeneration and plasticity.

The compounds envisaged can be small
(organic) molecules or bio-molecules (such as peptides,
sense or antisense (oligo-)nucleotides or chemical
modifications thereof). Alternatively, compounds can
be thought of as a series of plasmid nucleotide
constructs containing gene sequences in a screen for
novel unc-53-unrelated genes with a similar functional
effect in the cell, or genes related to the unc-53 gene
family, or novel members of the unc-53 gene family
based on sequence similarity, such as for example the
genes in plasmids pTB72, pCDU3, pCDU4, pCDU2, pCB201,
or modifications thereof such as for example epitope
tagged, deletion, complementation or mutagenised
nucleotide constructs.

The cellular assays envisaged in the claims have
been exemplified for three cell lines: the human
breast carcinoma cell line MCF-7, the mouse neuronal
cell line N4 and the mouse fibroblast cell line
NIH-3T3. Pharmacological assays are focused on
quantification of endpoints in a high throughput
screening mode. Many of the computer aids for
(semi-)automation are well known in the field and
currently applied in the applicant's laboratories. Given the
subtlety of the phenotypes observed, primary focus was
given to morphological assays that assess the
phenotypes or aspects thereof.



The nucleotide binding domain of Hu-UNC-53/1 has 
transforming activity in NIH3T3 fibroblasts 


Biochemical and genetic analysis suggest that
UNC-53 functions in GRB-2 mediated signal transduction
pathways controlling cell motility. The occurrence of
an altered hu-UNC-53/1 mRNA pattern in cancer cell
lines prompted us to investigate whether hu-UNC-53/1
plays a role in the transformed state of those cells.

To this end, we tested the ability of the nucleotide
binding domains of hu-UNC-53/1 and Ce-UNC-53 to
transform NIH/3T3 cells. Construct pCB201
(hu-UNC-53), which induces ruffling behaviour and cell
motility, was transfected into NIH3T3 cells.
Positive controls included Myc and H-ras. Negative
controls included empty vector, Rac1N17 and
cdc42N17.

The cells that survived G418 selection were
assayed for loss of contact inhibition (their ability
to grow as foci). Positive controls included the
combination of two well known oncogenes, Myc and H-ras,
which were able to produce a high number of foci. The
nucleotide binding domains of both Ce-UNC-53 and
hu-UNC-53/1 are able to induce foci in this assay (Fig. 24
and Table 1).




This suggests that the function of UNC-53 is not
restricted to the activation of motility. UNC-53 may
exert this additional function through the activation
of as yet unidentified signal transduction
pathways. Oncogenes frequently arise when a
"controlling" domain and an "activation" domain are
separated through chromosomal rearrangements or
integration of a part of a gene in an oncogenic
virus, e.g. the Erb receptor tyrosine kinases and Ost, a
nucleotide exchange factor for Rac-1.

Hu-UNC-53/1 is localized to chromosome 1q31.1

Clone F226 (BACH-135 (014), Genome Systems, Inc.)
was isolated from a human genomic BAC library using
pCR231 as a probe and was confirmed by sequence
analysis to be derived from the hu-UNC-53/1 locus.
Purified DNA from clone F226 was labeled with
digoxigenin dUTP by nick translation. Labeled probe
was combined with sheared human DNA and hybridized to
normal metaphase chromosomes derived from PHA
stimulated peripheral blood lymphocytes in a solution
containing 50% formamide, 10% dextran sulfate and 2X
SSC. Specific hybridization signals were detected by
incubating the hybridized slides in fluoresceinated
antidigoxigenin antibodies followed by counterstaining
with DAPI. The initial experiment resulted in
specific labeling of the long arm of a group A
chromosome. A second experiment was conducted in
which an anonymous probe, which was previously mapped
to 1p34 and confirmed by cohybridization with a
chromosome 1 centromere specific probe, was
cohybridized with F226. The experiment resulted in
the specific labeling of the long and short arms of
chromosome 1. Measurements of 10 specifically hybridized
chromosomes 1 demonstrated that F226 is located at a
position which is 52% of the distance from the
heterochromatic-euchromatic boundary to the telomere
of chromosome arm 1q, and that corresponds to band
1q31. A total of 80 metaphase cells were analyzed
with 72 exhibiting specific labeling (Fig. 25).
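
The band assignment above rests on simple arithmetic: the distance from a reference boundary to the signal, divided by the distance from that boundary to the telomere, averaged over the measured chromosomes. The following is a minimal sketch of that calculation. The function name and the three measurement pairs are hypothetical and serve only as illustration; only the 72-of-80 labeling count is taken from the text.

```python
from statistics import mean


def fractional_position(boundary_to_signal, boundary_to_telomere):
    """Position of a FISH signal as a fraction of the arm length.

    0.0 is the reference boundary, 1.0 is the telomere.
    """
    if boundary_to_telomere <= 0:
        raise ValueError("arm length must be positive")
    if not 0 <= boundary_to_signal <= boundary_to_telomere:
        raise ValueError("signal must lie between boundary and telomere")
    return boundary_to_signal / boundary_to_telomere


# Hypothetical measurements in arbitrary units (not from the source):
# (boundary-to-signal, boundary-to-telomere) for each chromosome.
measurements = [(2.6, 5.0), (2.5, 5.0), (2.7, 5.0)]
mean_fraction = mean(fractional_position(s, t) for s, t in measurements)

# Counts taken from the text: 72 of 80 metaphase cells were labeled.
labeling_efficiency = 72 / 80

print(f"mean fractional position: {mean_fraction:.2f}")
print(f"labeling efficiency: {labeling_efficiency:.0%}")
```

Translating the resulting fraction into a cytogenetic band still requires a reference ideogram, which is not modelled here.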

Gains of DNA sequences in 1q31 were found in more
than 10% of primary bladder tumors (Genes Chromosom
Cancer 12: 213-219 (1991)). A putative tumor
suppressor gene located near the locus F13B on
chromosome arm 1q31-q32 appears to be involved in the
pathogenesis of medulloblastoma (Int. J. Cancer 67:
11-15 (1996)). Loss of heterozygosity in this region
of chromosome 1 has been implicated in development of
human hepatoblastoma. Partial trisomies of 1q31 were
found in Ewing's sarcoma cell lines isolated from
patients (Cancer Genet Cytogenet 12: 1-19 (1984)).

Hu-UNC-53/2 is localised to chromosome 11p15.1

DNA from BAC clone F329 for hu-UNC-53/2 was
labeled with digoxigenin dUTP by nick translation and
applied in the experimental settings used for FISH of
hu-UNC-53/1 with F226. The initial experiment with
F329 resulted in the specific labeling of the mid
short arm of a group C chromosome which was believed
to be chromosome 11 on the basis of size, morphology
and banding pattern. A second experiment was
conducted in which a biotin labeled probe specific for
the centromere of chromosome 11 (D11Z1) was
cohybridised with clone F329. This experiment
resulted in the specific labeling of the centromere in
red and the mid short arm in green of chromosome 11.
Measurements of 10 specifically labeled chromosomes 11
demonstrated that F329 is located at a position which
is 65% of the distance from the centromere to the
telomere of chromosome arm 11p, an area which
corresponds to band 11p15.1. A total of 80 metaphase
cells were analysed with 72 exhibiting specific
labeling.

Chromosome 11p15 is a region showing loss of
heterozygosity (LOH) in a variety of human
malignancies, primarily breast cancer (Ali et al.,
Science 238, 185-188 (1987); Winqvist et al., Cancer
Res. 53, 4486-4488 (1993)) but also Wilms' tumor
(Dowdy et al., Science 254, 293-295 (1991); Cowell et
al., Br. J. Cancer 67, 1259-1261 (1993)), ovarian and
testicular malignancies (Lothe et al., Genes
Chromosomes Cancer 7, 96-101 (1993); Weitzel et al.,
Gynecol. Oncol. 55, 245-252 (1994)), stomach cancer
(Baffa et al., Cancer Res. 56, 268-272 (1996)), lung
cancer (Ludwig et al., Int. J. Cancer 49, 661-665
(1991); Fong et al., Genes Chromosomes Cancer (1994))
and infantile tumors of adrenal and liver (Byrne et al.,
Genes Chromosomes Cancer 8, 104-111 (1993)). Since
LOH is believed to indicate inactivation of a tumor
suppressor gene at the location where LOH occurs, the
frequent LOH found at 11p15 in multiple human cancers
suggests the presence of either a cluster of tumor
suppressor genes or a single tumor suppressor in this
region (Seizinger et al., Cytogenet. Cell Genet. 58,
10080-10096 (1991)). Chromosome transfer studies have
shown that chromosome 11 can suppress tumorigenicity
of both human breast cancer (Negrini et al., Cancer
Res. 55, 3003-3007 (1995)) and Wilms' tumor cells
(Dowdy et al., Science 254, 293-295 (1991)), and a gene
(named HTS1 or ST5) that may be responsible for
suppressing tumorigenicity in HeLa cells has been
mapped to 11p15 (Lichy et al., Cell Growth Diff. 3,
541-548 (1992)). Abnormalities at 11p15 have also
been identified in a variety of other cancers,
including lung cancer (parental origin of 11p15
deletion) (Kondo et al., Oncogene 9, 3063-3065
(1994)), bladder cancer (Presti et al., Cancer Res.
51, 5405-5409 (1991)), myeloid leukemia
(translocation) (Nakamura et al., Nat. Genet. 12,
154-158 (1996)), malignant astrocytomas and other
primitive neuroectodermal tumors (deletions) (Fults et
al., Genomics 14, 799-801 (1992)), rhabdomyosarcoma
(Scrable et al., Nature 329, 645-647 (1987)) and
hepatocellular carcinoma (Fujimori et al., Cancer Res.
51, 89-93 (1991); Wang et al., Cell Genet. 48, 72-78
(1988)). Recently a gene, TSG101, was cloned that is
mutated in human breast cancer and deleted in
uncultured primary human breast carcinomas (Li et al.,
Cell 88, 143-154 (1997)).

DIAGNOSTIC ASSAY USING THE DNA SEQUENCE OF HUMAN

The differential expression of human unc-53
transcripts in Northern blots of normal tissues versus
transformed cell lines, and the chromosomal locus of
hu-unc-53/1 at 1q31 being a locus linked to three
diseases, suggest the potential implication of
hu-unc-53 genes in oncogenesis. By using the complete
DNA sequence of hu-unc-53/1 or /2 or fragments thereof
in FISH, the potential involvement of these genes can
be diagnosed in patients as exemplified in figure 26.
Likewise, these hu-unc-53 sequences can be used in
diagnostic PCR assays to determine
overexpression of hu-unc-53s or fragments thereof.

Assay for the microscopic phenotype of UNC-53
transfected MCF-7 cells

Mock and unc-53 transfected MCF-7 cells were
seeded at low density in culture plates and allowed to
adhere to the vessel. Light microscopic inspection at
different time points, either on live cells or after
chemical fixation with Karnovsky's fixative, revealed
that cells in pcB201 transfected MCF-7 cultures have a
rounded cell body with many filopodia at their
boundaries. In contrast, mock or untransfected clones
had a predominantly "flat" phenotype with few or no
filopodia. Quantitative measurements confirmed the
statistical significance of this shift in phenotype
(table 2 below).

TABLE 2

Quantification of phenotypic changes in unc-53 transfected MCF-7 cells (*)

Transfection   clone   no feet (**)   with feet (**)   fraction with feet

mock           e       34             8                0.19
                       37             0                0
pcB201         2       17             92               0.84
                       37             83               0.69
               16      27             62               0.70
                       20             71               0.78
                       13             85               0.87




(*) Clones were passaged thrice, frozen and stored.
Thawed cells were trypsinised at confluency,
monodispersed, seeded in flasks and allowed to attach
to substrate overnight to 48 hours. Cultures were
fixed with Karnovsky fixative and inspected using
phase contrast microscopy. In parallel experiments,
resistance to geneticin was confirmed.

(**) values are expressed as cells per 
microscopic view. 
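
The "fraction with feet" column of Table 2 is the number of cells with feet divided by the total cells per view, and can be recomputed from the counts. The sketch below does so and, as an illustration only, adds a pooled two-proportion z statistic. The source does not say which statistical test was used, so the z-test is an assumption, and pooling the views ignores clone-to-clone variation.

```python
import math

# (transfection, no feet, with feet) per microscopic view, from Table 2.
counts = [
    ("mock", 34, 8), ("mock", 37, 0),
    ("pcB201", 17, 92), ("pcB201", 37, 83), ("pcB201", 27, 62),
    ("pcB201", 20, 71), ("pcB201", 13, 85),
]


def fraction_with_feet(no_feet, with_feet):
    """Cells with feet as a fraction of all cells counted in the view."""
    return with_feet / (no_feet + with_feet)


fractions = [round(fraction_with_feet(n, w), 2) for _, n, w in counts]


def pooled(group):
    """Total cells with feet and total cells counted for one transfection."""
    with_feet = sum(w for g, _, w in counts if g == group)
    total = sum(n + w for g, n, w in counts if g == group)
    return with_feet, total


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two pooled proportions."""
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


z = two_proportion_z(*pooled("pcB201"), *pooled("mock"))
print(fractions)
print(f"z = {z:.1f}")
```

The recomputed fractions agree with the values printed in the table.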

Assay for ruffling and motile behaviour using
automated time lapse

The dynamic changes in cells are well known in
the field. Animations of e.g. actin ruffles in
astrocytoma cells or of actin based cell motility in
e.g. fibroblasts can be accessed
(http://www.stc.cmu.edu/CLMIBhp/Imggallpg/Moviespg/actinruffle.mov) or
(http://util.ucsf.edu/mitchi/Movies/migration.html) on
the world wide web. The dynamic changes as a result
of transfection with unc-53 can best be appreciated in
time lapse video sequences. At high magnification,
the "filopodia" display arrays of microspikes with
highly dynamic behaviour. A rough visual estimate
suggests these phenomena to be at least 10-fold
increased in pcB201 transfected cells relative to the
mock-transfected MCF-7 cells. Animations of these
clones in NIH-Image can be requested from the author or
applicant.

Time lapse video imaging probably is the most
informative way to appreciate the unc-53-induced
phenotype in MCF-7 and is amenable to high throughput
screening in a pharmacological context. Time lapses
compressing 5 minutes real time supply sufficient
information to quantitate the intensity of the motile
behaviour of pcB201 transfected MCF-7 cells in e.g. 12
well plates. In addition, algorithms have been
described in the field which can automatically compute
the "motile area" of cells by comparing cells in two
images appropriately spaced in time (van Laerebeke
et al., 1992, Cytometry, 13, 1-8).
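
One simple way to read "motile area" is the number of pixels covered by the cell in exactly one of two frames. The sketch below implements that reading on binary masks. It is an assumption made for illustration and is not the cited algorithm; the function name and the two example frames are hypothetical.

```python
def motile_area(mask_t0, mask_t1):
    """Count pixels covered by the cell in exactly one of two frames.

    Each mask is a list of rows of 0/1 values (1 = cell, 0 = background).
    """
    same_shape = len(mask_t0) == len(mask_t1) and all(
        len(r0) == len(r1) for r0, r1 in zip(mask_t0, mask_t1)
    )
    if not same_shape:
        raise ValueError("both frames must have the same dimensions")
    return sum(
        p0 != p1
        for row0, row1 in zip(mask_t0, mask_t1)
        for p0, p1 in zip(row0, row1)
    )


# Hypothetical 4x4 segmented frames taken some minutes apart.
frame_t0 = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
frame_t1 = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

area = motile_area(frame_t0, frame_t1)
print(area)
```

In a screening setting the count would normally be normalised, for example to the cell area in the first frame, so that wells with different cell sizes remain comparable.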

Assay for visualising unc-53-induced F-actin
recruitment in MCF-7 cells

Cultures were chemically fixed, detergent
extracted and fluorescently stained for F-actin
(filamentous actin) using fluorescently labeled
phalloidin (Wieland et al., 1985, Int. J. Peptide &
Protein Res., 21, 3-10), which displays in a more
specific way the dramatic phenotypic changes due to
transfection with unc-53 transgenes. By capturing and
analysing images of the F-actin patterns, image
analysis algorithms well known in the field can assess,
in an automated way, the F-actin filament positions,
texture and distribution relative to the nuclear
position or gravity point of the cells. Such
algorithms are capable of discriminating phenotypic
changes and thus also effects of pharmacological
inhibitors of transgene-induced phenotypes as well as
compound induced unc-53 like phenotypes in mock or
untransfected cells.
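
As an illustration of measuring a distribution "relative to the gravity point", the sketch below computes the intensity-weighted centroid of a small grey-value image and the intensity-weighted mean distance of the signal from a chosen centre. Both function names and the two 3x3 images are hypothetical; real analyses would work on segmented single cells and typically use a dedicated image library.

```python
import math


def intensity_centroid(image):
    """Intensity-weighted centroid (x, y) of a 2-D list of grey values."""
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image contains no signal")
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(image) for v in row) / total
    return cx, cy


def mean_radial_distance(image, centre):
    """Intensity-weighted mean distance of the signal from `centre`."""
    cx, cy = centre
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image contains no signal")
    weighted = sum(
        v * math.hypot(x - cx, y - cy)
        for y, row in enumerate(image)
        for x, v in enumerate(row)
    )
    return weighted / total


# Hypothetical F-actin images: signal at the centre vs. at the periphery.
central = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]
peripheral = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]

print(mean_radial_distance(central, intensity_centroid(central)))
print(mean_radial_distance(peripheral, intensity_centroid(peripheral)))
```

A larger mean radial distance indicates signal concentrated at the cell periphery, which is where filopodial F-actin would be expected.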

Phagokinesis assay for unc-53-induced
directionality and quantity of motility

The methods are described in the experimental
section. Two cell populations with different motile
behaviour in phagokinesis assays were observed. In
table 3 below the fraction of mock and UNC-53
transfected MCF-7 cells that produced linear tracks in
the phagokinesis assay is shown. In the mock
transfected MCF-7 cells, 61% of the cells produced a
round track (long and short axis less than 2-fold
different) and 39% of the cells produced "linear" tracks
(long and short axis more than 2-fold different).
pcB201 transfected MCF-7 cells showed an increase of
the fraction of cells displaying "linear" tracks to
50%. An increase in the fraction of linear tracks was
also observed for MCF-7 cells transfected with full
sequence Ce-unc-53.
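
The round/linear criterion described above can be written as a short classifier. This is a minimal sketch: the function names and example track dimensions are hypothetical, and because the text does not say how a ratio of exactly 2 is scored, the sketch assumes such a track counts as round.

```python
def classify_track(long_axis, short_axis, threshold=2.0):
    """Classify a phagokinesis track by the ratio of its two axes.

    Ratios above `threshold` are "linear", all others "round".
    Scoring a ratio of exactly 2 as round is an assumption.
    """
    if short_axis <= 0 or long_axis < short_axis:
        raise ValueError("expected long_axis >= short_axis > 0")
    return "linear" if long_axis / short_axis > threshold else "round"


def fraction_linear(tracks):
    """Fraction of tracks classified as linear."""
    labels = [classify_track(long_, short) for long_, short in tracks]
    return labels.count("linear") / len(labels)


# Hypothetical (long axis, short axis) measurements in arbitrary units.
tracks = [(10, 8), (30, 10), (12, 6), (25, 5)]
print([classify_track(a, b) for a, b in tracks])
print(fraction_linear(tracks))
```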

In addition, a significant increase of 50% in the
median area of tracks of a culture vessel was observed
in the pcB201 transfected MCF-7 cells versus mock
transfected MCF-7 cells (Table 3). These observations
suggest that pcB201 as well as pTB72 transfection into
MCF-7 cells is capable of increasing in situ
locomotion in Ce-UNC-53 MCF-7, e.g. by increasing
spreading, ruffling, or other forms of non-directional
motility in the "round" population. In addition, the
Ce-UNC-53 transgene in MCF-7 cells drives a fraction
of the MCF-7 cells from non-directional motility
(round tracks) into directional migration (linear
tracks). Clone 2 thus provides a tool to analyse
inhibitory or stimulating effects of pharmacological
compounds on directionality or quantity of cell
motility in relation to UNC-53.

Table 3. Analysis of motility in phagokinesis assays

Track morphology: fraction linear tracks

plasmid   clone     round   linear   fraction linear

Mock      z4        18      13       0.42
                    17      11       0.39
                    22      12       0.35
                    16      9        0.36
pCB201    Clone 2   13      13       0.5
                    7       8        0.53
                    9       9        0.5

Track size

Clone     average±SD   min    max    (N)

24        1626±188     1444   2011   (8)
Clone 2   2326±283     1989   2816   (8)



Assays for the localisation of UNC-53 in the cell
to microtubules or microtubule (+) plus ends

UNC-53s have been shown to reside on microtubules
and preferentially on the microtubule (+)-ends of
cells. This localisation represents an important
feature of the UNC-53 family of proteins, which is
rarely observed in other proteins. Absence of
microtubule (+)-end binding in the protein APC
following mutation has been implicated in the role of APC
in colon cancer (Smith et al., 1994, Cancer Res., 54,
3672). By analogy, it can be postulated that the
proper functioning of UNC-53 also may depend on its
specific localisation in the cell.

The methods used in the examples which prove the
co-localisation with microtubules form a basis for a
series of assays for compounds which specifically
affect microtubule (+)-end binding of UNC-53s. To the
skilled eye, the typical localisation of an UNC-53
protein on microtubules can be readily recognised and
thus is sufficient for the interpretation of whether
the treatment with a compound has affected the
localisation of this UNC-53 (or a fragment thereof).
Moreover, by combining the described co-localisation
methods, well known to one skilled in the field
and exemplified by the methods in the "experimental
procedures" section, one can unequivocally confirm a
compound's ability to abrogate (or promote)
microtubule and microtubule (+)-end binding.

Such an assay comprises contacting a cell culture
of a cell line expressing an UNC-53 with a compound in
the culture conditions proper for the said cell line,
followed by an incubation and finally observation of
the UNC-53 (or fragment) in situ by e.g. fluorescence
microscopy (for GFP-chimeras) or by fixing the cell
culture and performing an immunocytochemical staining
for the UNC-53 (or fragment). For the
co-localisation, methods such as immunocytochemistry for
the microtubules of a cell or cell line combined with
either immunocytochemistry for Ce-UNC-53 or Hu-UNC-53s
or fluorescent detection of GFP-UNC-53 chimeras are
performed consecutively.

C. elegans UNC-53 preferentially binds microtubule
plus-ends or GTP-tubulin

Biochemical characterisation of UNC-53 has shown
that UNC-53 binds the SH3 binding domains of
SEM-5/GRB-2 and binds F-actin in vitro. GRB2 has been
localised to the cortex of the cell and reported to be
involved in the control of cell motility. To
determine the in vivo subcellular localisation of
Ce-UNC-53, we transiently transfected COS, HepG2 and MCF7
cells with pTB72, an expression construct containing
the full length Ce-unc-53 cDNA. This construct was
previously shown to activate cell motility in N4
neuroblastoma and MCF7 cells. This construct gives
high transient expression in COS cells, high to medium
levels of expression in MCF7 cells and medium to low
levels of expression in HepG2 cells. To visualise
UNC-53, tubulin and F-actin, transfected cells were
stained with various combinations of the
anti-Ce-UNC-53 mab 16-48-2, rabbit anti-UNC-53 polyclonal,
anti-tubulin mab YL1/2 and fluorescently labelled
phalloidin.

At high levels of expression UNC-53 co-localises 
with the entire microtubule cytoskeleton, but at lower 
expression levels UNC-53 signal is restricted to the 
terminal regions of the microtubules at the plus ends. 
Very low levels of expression yield a dot-like
pattern in the vicinity of the cortex of the cell. 

To map the MTB plus end domain of Ce-UNC-53, we
made two constructs, pcDU2 (figure 17) and pcDU3 (figure
15), in which the amino terminus of Ce-UNC-53 is deleted.
Proteins corresponding to these constructs are thought
to be made in vivo from different unc-53 promoters.
Transient transfections followed by immunolocalisation
showed these proteins to be cytoplasmic. In stable
transfections in N4 neuroblastoma cells and MCF7 cells
they were shown to be no longer toxic to cells but to
cause highly increased activation of filopodia
formation. We thus uncoupled (1) toxicity of
Ce-UNC-53 from activation of motility and (2) microtubule
binding from the activation of motility.

Analysis of the microtubule association of the
C. elegans and Human1 UNC53

To isolate the microtubule association domain of
the C. elegans UNC53, N-terminal GFP fusions were made.
C-terminal deletions on the fusion product revealed
that the microtubule association was localised in the
N-terminal half of the protein. A GFP fusion was also
constructed with the Human1-UNC-53, to analyse the
microtubule association properties of this protein.
The association with microtubules was confirmed. A
mouse antiserum was used to show the presence of
native Unc-53 on microtubule plus ends of melanoma
line G361. The epitope recognition of the antibody
was confirmed by immunohistology experiments with
mammalian cells transiently transfected with pLM4,
expressing the GFP-hu1-UNC53 fusion protein.

Results 

1. When pTB72 is transiently transfected into
several cell lines, C. elegans UNC-53 associates with
microtubules and preferentially the plus-ends of the
tubulin fibres. Transfection of plasmids pCDU3 and
pCDU2 in N4 and MCF7 cell lines did not result in the
observation of microtubule co-localisation. pCDU4
resulted in no staining using mab 16-48 antibody (LMBP
Accession No. 1383CB), from which we conclude that the
epitope for this antibody is localised outside the fragment
expressed by pCDU4.

It is possible that the microtubule associated
domain is situated in the N-terminus of the protein.
For this reason, we constructed an N-terminal GFP
fusion with the full length C. elegans UNC-53 sequence,
and various C-terminal deletion derivatives. These
fragments encode the N-terminal part of UNC-53 from
139 to 760 aa.

Furthermore, to analyse whether the cloned fragment of
hu1-unc53 could also be associated with microtubules,
a plasmid encoding a GFP fusion with the hu1-Unc53
protein was constructed, and introduced into mammalian
cells. A derivative of this construct was also
constructed.

2.

a) Transient expression of C. elegans Unc-53 GFP
fusion in N4 neuroblastoma lines

N4 cells were transiently transfected with
pEGFP72, encoding a fusion protein of GFP and the full
length C. elegans unc-53 sequence. On an inverted
microscope, the fluorescence of the GFP molecule could
be followed in living cells. Cells which expressed
low to medium levels of the fusion molecule showed a
normal morphology after 18h to 30h. In these cells
the co-localisation of the GFP fusion protein with the
microtubules could clearly be demonstrated (figure
38a). In cells which demonstrated a low but still
distinct GFP fluorescence, specific microtubule
plus-end association could be observed (figure 38b). Cells
expressing high levels of the GFP fusion protein tend
to round up, in such a way that the microfilaments are
difficult to visualise. After 48h, almost no GFP
expressing cells can be found. It has previously been
observed in transient expression of Unc-53, using
plasmid pTB72, that the protein is toxic for the
cells. The transient transfection experiments with
the pEGFP72 plasmid give the same observation,
indicating that at least two features of the Unc53
protein are conserved in the GFP fusion protein, namely
the microtubule association and the toxicity of the
protein.

The transfected cells were fixed with
paraformaldehyde, and the tubulin was stained using
antibody YL1/2 and antimouse-CY3 (Jackson Labs).
Although a significant loss of GFP fluorescence was
observed, one could clearly demonstrate that the
filaments observed with the GFP fluorescence
co-localise with the microtubule staining (figure 39).

Putative assay

Mammalian cells, in this case N4, were
transfected with a lipofecting agent (lipofectAMINE)
while in suspension, not being attached to a surface.
After transfecting those cells with pEGFP72, the
transfected cell suspension could be diluted in 24-
and/or 96-well plates, enabling them to attach to the
surface. Each well may contain a different compound
of the collection to screen. After 24h, plates could
be automatically screened for fluorescence levels.
Wells containing a compound that abolishes the toxicity
of the GFP-C. elegans UNC-53 fusion protein will give
high levels of fluorescence. Compounds having no
effect on the fusion product will give no or only low
levels of fluorescence.
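
The plate read-out described above amounts to flagging wells whose fluorescence clearly exceeds that of untreated transfected wells. The sketch below shows one common way to set such a cut-off, the control mean plus three standard deviations. This rule, the function name, the well labels and all readings are assumptions made for illustration; the source does not specify a threshold.

```python
from statistics import mean, stdev


def call_hits(control_wells, compound_wells, n_sd=3.0):
    """Return wells whose fluorescence exceeds mean + n_sd * SD of controls.

    control_wells: readings from transfected wells without compound.
    compound_wells: mapping of well label -> fluorescence reading.
    """
    threshold = mean(control_wells) + n_sd * stdev(control_wells)
    return {
        well: value
        for well, value in compound_wells.items()
        if value > threshold
    }


# Hypothetical plate-reader values in arbitrary fluorescence units.
controls = [100, 110, 90, 105, 95]
compounds = {"A1": 98, "A2": 450, "A3": 120, "A4": 130}

hits = call_hits(controls, compounds)
print(sorted(hits))
```

Wells flagged this way would only be candidates: a compound that is itself fluorescent would also score, so hits need confirmation by microscopy.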



b) Transient expression of the truncated
GFP-C. elegans UNC-53 fusion proteins.

To assay whether the microtubule association
occurs in the N-terminal part of the C. elegans
Unc-53 protein, various C-terminal deletions were
constructed.

Transfection of pEGFPsma and pEGFPecl, coding
for 760 aa and 670 aa of the N-terminal part of C. elegans
UNC-53 in fusion with GFP, resulted in microtubule
association, as could be visualised in living cells.
The association with the microtubules is less abundant
than observed when expressing the full length
C. elegans UNC-53 protein, but fibres could clearly be
observed (figures 40a and 41a). More background
fluorescence is seen. This could be due to a weaker
association with the microtubules or to an instability of
the fusion protein. The association with microtubules
could not be observed after fixing the cells with
paraformaldehyde nor with methanol fixation, giving an
extra indication for the weak association with the
microtubule network of these proteins or potential
instability of the fusion protein. At low expression
levels the association of the GFP fusion protein with
the centrosomes could clearly be detected (Figures 40b
and 41b). Centrosomes are the location in the cell
with the highest microtubule concentration.

No plus-end associations could be observed
with the deletion constructs, even when cells were
expressing low levels of the GFP fusion proteins. In
the case of very low expression, the centrosomes
could clearly be detected.

When transfecting N4 cells with pEGFPsac or
pEGFPxba, coding for 139 aa and 256 aa of the
N-terminal part of C. elegans UNC-53 in fusion with GFP,
no microtubule association could be observed. This
indicates that at least 670 aa of the N-terminus of
the C. elegans UNC-53 is needed to have microtubule
association (figures 42a and 42b).

c) Transient expression of the GFP-hu-UNC-53/1
fusion proteins and a deletion derivative.

Plasmid pLM4 was transiently transfected
into N4 neuroblastoma cells, and GFP fluorescence was
observed in living cells. GFP fluorescence of the
available sequence of hu1-UNC-53 in fusion with GFP
was localised at the microtubule level. Moreover, at
lower expression levels, both the centrosomes and
specific plus-end association could be observed. As
has been observed with the C. elegans UNC-53
derivatives in fusion with GFP, expressed by the
plasmids pEGFPsma and pEGFPecl, the GFP association
seems to be less tight than was observed with the full
length C. elegans UNC-53 fragment in fusion with GFP.
The observed instability of the fusion protein can be
due to a weaker association with microtubules, or to a
degradation of the fusion protein (figure 43).



d) Immunofluorescence on melanoma line G361, 
and on neuroblastoma line N4 transiently transfected 
with pLM4. 

Introduction 

Northern experiments show that the melanoma
cancer line G361 abundantly expresses both the Human1
and Human2 homologues of C. elegans UNC-53. To test whether
the proteins could be localised in this cell line, a
collection of mouse sera was tested on this cell line.
To verify that the observation was due to hu-UNC-53
recognition, and not to an artifact, a positive serum
was applied to N4 cells transiently transfected with
pLM4, expressing the GFP-hu1-Unc fusion.

A serum, designated 28.1, from a mouse previously
injected with peptide (DNRTLPKKGLYRY), a conserved
sequence of the UNC-53 family, was used for an
immunolocalisation experiment on G361 cells fixed with
paraformaldehyde. Antimouse-Cy3 was applied as second
antibody. Association with microtubule plus-ends could
clearly be observed. Moreover, in cells showing
directional movement, observed as growth cone
extensions, abundant staining can be seen in the tip
of the growth cone (figure 45). To test whether the
microtubule associated protein recognised was
identical to the Hu1-UNC-53 protein, N4 cells were
transiently transfected with plasmid pLM4 and
subsequently fixed with paraformaldehyde and stained
with serum 28.1. Only cells that were transfected
showed staining with 28.1, indicating that the
antibody of 28.1 recognised the Hu1-UNC-53-GFP fusion
protein (figure 46). This confirms that the staining
of the microtubule plus-ends in the growth cones of
G361 by serum 28.1 is due to a recognition of at least
the Human1 and/or the Human2 homologue. It is
concluded that the overexpressed human
homologue of C. elegans UNC-53 in the melanoma
cancer line G361 is located on the microtubule
plus-ends.

Conclusions 

a) - GFP-C. elegans UNC-53 fusion protein
expressed by pEGFP72 shows Unc53 activity

b) - GFP-C. elegans UNC-53 fusion protein
expressed by pEGFP72 shows microtubule association

c) - GFP-C. elegans UNC-53 fusion protein
expressed by pEGFP72 shows microtubule plus-end
association

d) - GFP-C. elegans UNC-53 (deletion variant)
fusion proteins expressed by plasmids pEGFPsma and
pEGFPecl show microtubule association.

e) - GFP-C. elegans UNC-53 (deletion variant)
fusion proteins expressed by plasmids pEGFPsma and
pEGFPecl do not show microtubule plus-end association

f) - GFP-C. elegans UNC-53 (deletion variant)
fusion proteins expressed by plasmids pEGFPxba and
pEGFPsac do not show microtubule associations.

g) - GFP-hu1-UNC-53 fusion protein expressed by
plasmid pLM4 shows microtubule association.

h) - GFP-hu1-UNC-53 fusion protein expressed by
plasmid pLM4 shows microtubule plus end association.

i) - serum 28.1 recognises the Hu1-UNC-53-GFP
fusion protein as expressed by plasmid pLM4 in
transiently transfected Neuroblastoma cells N4.

j) - the expressed human homologue of C. elegans
UNC-53 in melanoma line G361 (being at least hu1-Unc-53) is
associated with the microtubule plus-ends.

EXPERIMENTAL PROCEDURES 
Materials

The oligonucleotides used in the PCR-RACE
experiments were synthesised by Eurogentec (Belgium).
Radioactive compounds were obtained from Amersham.
The pCDNA3.1 eukaryotic expression vectors, human
λgt10 cDNA libraries, Marathon-RACE cDNAs, human
Northern blots and the T7-tag monoclonal antibody were
purchased from Invitrogen. N4, MCF7 and NIH 3T3 cells
were retrieved from the Janssen Research cell bank.

PCR-RACE conditions

1. A quick screen human cDNA library panel was used
to amplify EST clone gb..R41071. The primers used
were ESTfw 5'-AATGGCTTCCTGGTTACCTGAG-3' and ESTrv
5'-CAAGTCAGCACCCCGAAGCAGCTCT-3'. Human genomic DNA was
also used as template (100 ng/reaction). The
amplification conditions were as follows: 1 min at
94°C, 30 sec at 55°C, 30 sec at 72°C, then 35 more
times and a final extension of 20 min at 72°C. This
PCR fragment was cloned in vector pCR2.1. The
resulting plasmid was designated pCR231.

A human heart clone was also produced by RACE-PCR
from a human heart Marathon cDNA using the following
conditions: 1 min at 94°C, 30 sec at 70°C, 3 min 30 sec
at 72°C, then 35 more times and a final extension of
20 min at 72°C. KlenTaq DNA Polymerase was purchased
from Invitrogen.

For the mouse homologue, total RNA was obtained
from N4 murine cells as described. A first strand
cDNA was synthesized from 2 µg of RNA using the
Ready-To-Go cDNA kit (Pharmacia). The primers used were M-ESTfw
5'-CCTCTGTGGGCACCGAGGTCACC-3'. The amplification
conditions were as follows: 1 min at 94°C, 30 sec at
58°C, 30 sec at 72°C, then 35 more times and a final
extension of 20 min at 72°C. All the amplification
products were subcloned in pCRII-1 and several
independent clones were analyzed by sequencing.
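
The cycling conditions above can be written down as a small program description, which makes the number of cycles and the minimum run time explicit. The sketch below does this for the mouse amplification. It reads "then 35 more times" as 36 cycles in total, which is an interpretation of the text, and it ignores ramp times between temperatures; the function and variable names are hypothetical.

```python
def program_seconds(cycle_steps, n_cycles, final_extension_s=0):
    """Total hold time of a PCR program, ignoring ramping.

    cycle_steps: list of (temperature in deg C, hold time in seconds).
    """
    per_cycle = sum(seconds for _temperature, seconds in cycle_steps)
    return n_cycles * per_cycle + final_extension_s


# 1 min at 94 C, 30 sec at 58 C, 30 sec at 72 C (mouse homologue).
mouse_cycle = [(94, 60), (58, 30), (72, 30)]

# "then 35 more times" is read here as 1 + 35 = 36 cycles (assumption).
total_s = program_seconds(mouse_cycle, n_cycles=36, final_extension_s=20 * 60)
print(f"{total_s} s = {total_s / 60:.0f} min")
```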

2. Screening of Human Heart/Colorectal Adenocarcinoma
cDNA library

A human heart cDNA library and a human colorectal
adenocarcinoma cDNA library were screened using
pCR231bp as probe by the standard plaque hybridization
method. The screening produced several positive
clones in each library, called respectively λHH3, λHH4,
λHH15, λCAD14 and λCAD27. The positive phages were
purified by two additional rounds of plaque screening
and were then amplified.

3. 5' extension using PCR

Three primers with homology to the 5' end of
clone λHH3b were made:
HU53rv1 (5'-cct-ggg-act-gaa-gct-ggt-acc-tga-gcc-3'),
HU53rv2 (5'-ttg-gga-aga-gtg-ttc-cga-tcc-cgc-tg-3') and
HU53rv3 (5'-gtt-gcc-cag-ctc-tgg-ggc-ttc-cac-tcc-3').
They were used together with the λgt10rv primer
(5'-gag-gtg-gct-tat-gag-tat-ttc-ttc-cag-ggt-a-3') in three nested PCR
reactions on a cDNA amplified library from Human Heart
(Clontech). The reaction mixes contained 25 pmol of
each primer, 1 mM of each dNTP, 1 µl KlenTaq Polymerase
Mix (50x) and 0.1 ng DNA. The cycling parameters for
the first PCR were: 3 min at 94°C, 35 cycles of 1 min
at 94°C, 1 min at 51°C and 3 min at 72°C and a final
extension of 10 min at 72°C, using HU53rv1 and λgt10rv
as primers. 0.4 µl of this primary PCR product was
amplified using HU53rv2 and λgt10rv as nested primers
with the following parameters: 3 min at 94°C, 38
cycles of 1 min at 94°C, 1 min at 52°C and 3 min 30
sec at 72°C and a final extension of 10 min at 72°C.
The second nested PCR reaction was performed on 0.4 µl
of a 1/50 diluted purified 2.4 kb fragment using
HU53rv3 and λgt10rv as primers: 3 min at 94°C, 35
cycles of 1 min at 94°C, 1 min at 56°C and 3 min 30
sec at 72°C and a final extension of 10 min at 72°C.
A 774 bp amplification product was subcloned in
pCR2.1, resulting in plasmid pCB210-14. The cloned
fragment was analyzed by sequencing. This fragment
extends 699 bp in the 5' direction (see fig 9).

4. 5' extension using PCR 

Primer HU53rv4 (5'-ccc-tgc-ttg-gtg-ctg-agg-aga-ctg-g-3')
was designed on the 5' end of clone pCB210-14
and was used together with λgt10rv to amplify a
fragment of the Human Heart cDNA library with the
following parameters: 3 min at 94°C, 35 cycles of 1
min at 94°C, 1 min at 60°C and 3 min 30 sec at 72°C
and a final extension of 10 min at 72°C. An 887 bp
fragment was subcloned in pCR2.1, resulting in plasmid
pCB212. The cloned fragment was analyzed by
sequencing. This fragment extends a further 767 bp in
the 5' direction (see fig 9).
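
Primers written in hyphen-separated triplets, as above, have to be normalised before they can be checked or ordered. The sketch below strips the separators and reports length, GC fraction and a rough melting temperature from the Wallace rule of thumb (2°C per A/T plus 4°C per G/C). The rule is a coarse estimate intended for short oligonucleotides and is not stated in the source; the function names are hypothetical.

```python
def normalise_primer(text):
    """Remove hyphens and whitespace and return the upper-case sequence."""
    sequence = "".join(ch for ch in text if ch.isalpha()).upper()
    if set(sequence) - set("ACGT"):
        raise ValueError("primer contains characters other than A, C, G, T")
    return sequence


def gc_fraction(sequence):
    """Fraction of G and C bases in the sequence."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def wallace_tm(sequence):
    """Rough melting temperature in deg C by the Wallace rule of thumb."""
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    return 2 * at + 4 * gc


# Primer HU53rv4 as printed in the text.
hu53rv4 = normalise_primer("ccc-tgc-ttg-gtg-ctg-agg-aga-ctg-g")
print(hu53rv4, len(hu53rv4), gc_fraction(hu53rv4), wallace_tm(hu53rv4))
```

For a 25-mer such as HU53rv4 the Wallace value overestimates the practical annealing temperature, so it is useful mainly for comparing primers with one another.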

5. Human Heart Library screening using the 0.8 kb 
insert of pCB212 as probe 

The EcoRI digested and purified clone pCB212 was
used as probe to screen the Human Heart cDNA library
(Clontech) using the standard plaque hybridization method.
The positive phages were purified by two additional
rounds of plaque screening. The insert of the λ DNA
(produced using Qiagen Lambda Kit) was analyzed by
sequencing. This clone, pHH14-3, yielded a 2663 bp
fragment overlapping pCB212, pCB210-14 and the 3' end
(434 bp) of λHH3b, and a 761 bp 5' extension (see
fig 9).

3' and 5' extension of HU-Unc53/2 from EST46037 






WashU-Merck EST 46037 

Transformed cells carrying the EST 46037 sequence
were ordered from Research Genetics. Plasmid DNA was
isolated using standard protocols (Qiagen plasmid DNA
isolation kit) and the sequence of the insert was
determined.

3' extension of EST 46037 by RACE

Marathon-Ready cDNAs (Clontech) are premade
"libraries" of adaptor-ligated double-stranded cDNA
ready for use as templates in RACE experiments.
Five µl of Marathon-Ready cDNA was used as template in a
regular 50 µl RACE reaction. The RACE mixture contained 1x
KlenTaq PCR buffer, 0.2 mM of each dNTP, 1x Advantage
KlenTaq polymerase mix (Clontech), 0.15 µM AP1 adaptor
primer and 0.15 µM RACE gene specific primer. The
amplification conditions were as follows:

94°C for 1 min, 5 cycles of 94°C for 30 s and 72°C for
4 min, 5 cycles of 94°C for 30 s and 70°C for 4 min, 25
cycles of 94°C for 30 s and 68°C for 4 min.
One-hundred-fold diluted RACE product was used as a
template in a nested PCR with AP2 adaptor and gene
specific nested PCR primers. Specific nested PCR
fragments were cloned into pCR2.1 (TA cloning kit,
Invitrogen) and the sequences of the inserts were
determined.
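
The touchdown cycling conditions given above can be written down as data in order to check the number of amplification cycles and the minimum run time. The following Python sketch is illustrative only: the names RACE_PROGRAM, amplification_cycles and total_hold_minutes are hypothetical, and ramp times between temperatures are ignored.

```python
# Touchdown RACE program as stated in the text.
# Each block: number of cycles and a list of (temperature in °C, hold in seconds).
RACE_PROGRAM = [
    {"cycles": 1, "steps": [(94, 60)]},              # initial denaturation
    {"cycles": 5, "steps": [(94, 30), (72, 240)]},
    {"cycles": 5, "steps": [(94, 30), (70, 240)]},
    {"cycles": 25, "steps": [(94, 30), (68, 240)]},
]


def amplification_cycles(program):
    """Count cycles in blocks that have more than one step (denature + anneal/extend)."""
    return sum(block["cycles"] for block in program if len(block["steps"]) > 1)


def total_hold_minutes(program):
    """Sum of all hold times in minutes; ramping between temperatures is not included."""
    seconds = sum(
        block["cycles"] * sum(hold for _, hold in block["steps"])
        for block in program
    )
    return seconds / 60


print(amplification_cycles(RACE_PROGRAM))  # 35
print(total_hold_minutes(RACE_PROGRAM))    # 158.5
```

The hold times alone therefore add up to a little over two and a half hours for the primary RACE reaction.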

gene specific primer (EST46037-F1):
5' AGTGAGAACAATGCTGTGGACATGC
nested gene specific primer (ES46037-F2):
5' CTGCTCAACTGCAAGTACCACAAATGC
Marathon cDNA library: human placenta
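
Basic properties of the two gene specific primers listed above (length, GC fraction, reverse complement) can be checked with a few lines of Python. This is a minimal sketch; the helper names gc_fraction and reverse_complement are hypothetical and are not part of the described method.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "EST46037-F1": "AGTGAGAACAATGCTGTGGACATGC",
    "ES46037-F2": "CTGCTCAACTGCAAGTACCACAAATGC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

Both primers are 25 to 27 nucleotides long with a GC fraction close to one half, which is in keeping with the high annealing/extension temperatures (68 to 72°C) of the RACE program.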

WashU-Merck EST 923793 






Transformed cells carrying the EST 923793
sequence were ordered from Research Genetics. Plasmid
DNA was isolated using standard protocols (Qiagen
plasmid DNA isolation kit) and the sequence of the insert
was determined.

RACE fragments 1.4 and 3.7, 5' extension of
EST46037

Method as described previously.
gene specific primer (EST46037-R1):
5' ACTGCCTTGAGACTCTGACTTCAGC
nested gene specific primer (ES46037-R2):
5' TGGGCAGAACTGAGAGCTTCTAAGC
Marathon cDNA library: human placenta

RACE fragments B2.1, D2.1, H2.1: 5' extension

Method as described previously.
gene specific primer (97010709):
5' ATTCTTTTGCATCTTCTTGCGTGCG
nested gene specific primer (97010708):
5' ACCTGAGTCCTTTCTTAGGCAAAGTGTTCC
Marathon cDNA library:
human placenta (fragment B2.1)
human HeLa S3 (fragment D2.1)
human colorectal adenocarcinoma SW480 (fragment H2.1)

PCR fragments E2.3, C2.3

EST 485068 is similar to but not identical with
the 5' end of HU-Unc53/1. A primer pair consisting of
one 3' EST 485068 primer and one 5' HU-Unc53/2 primer
was used to PCR amplify those fragments. A λgt10 human
placenta Quick screen library (fragment C2.3) or
Marathon cDNA from human HeLa S3 (fragment E2.3) was
used as template in a PCR. A 50 µl reaction mix
contained 1x PCR II buffer (Perkin-Elmer), 1.5 mM
MgCl2, 0.2 mM of each dNTP, 0.15 µM forward and
reverse primer, 2.5 U AmpliTaq Gold (Perkin-Elmer)
and 1 µl template. The cycling parameters were 5
minutes at 95°C, 35 cycles of 45 seconds
at 94°C, 45 seconds at 65°C and 2 minutes at 72°C.
The PCR products were sliced out from an agarose gel
and purified using a gel extraction kit (Qiagen); one
µl thereof was used in a second round PCR using the
same conditions as above. The PCR products were
purified (Qiagen PCR purification kit) and directly
sequenced.

primers:

(97010709) 5' ATTCTTTTGCATCTTCTTGCGTGCG
(97012802) 5' CGCTCCCCATCAGATGCAGGCCGG

PCR fragment E1.3-3

EST 01222 is homologous but not identical with
the 5' end of HU-Unc53/1. A primer pair consisting of
one 3' EST 01222 primer and one 5' HU-Unc53/2 primer
was used to PCR amplify this fragment.
Marathon cDNA from human HeLa S3 was used as template
in a PCR. A 50 µl reaction mix contained 1x PCR II
buffer (Perkin-Elmer), 1.5 mM MgCl2, 0.02 mM of each
dNTP, 0.15 µM forward and reverse primer,
2.5 U AmpliTaq Gold (Perkin-Elmer) and 1 µl template.
The cycling parameters were 5 minutes at 95°C, 35
cycles of 45 seconds at 94°C, 45 seconds
at 65°C and 2 minutes at 72°C. The PCR products were
sliced out from an agarose gel and purified using a
gel extraction kit (Qiagen); one µl thereof was used in
a second round PCR using the same conditions as above.
The PCR products were analysed on an agarose gel; the
fragment of interest was sliced out, purified (Qiagen
PCR purification kit) and cloned into
pCR2.1. The sequence of the insert was determined.

RACE fragments A2.2-2, B2.1-4, D2.1-5: 5'
extension

Method as described previously.
gene specific primer (97041701):
5' TATGCTACGGCCACTCATCTCCGTGG
nested gene specific primer (97041702):
5' TGTAACCTGAGTTCCCCTTAAACTGG
Marathon cDNA library:
human placenta (fragment A2.1-2)
human HeLa S3 (fragment B2.1-4)
human colorectal adenocarcinoma SW480 (fragment
D2.1-5)

Translation-initiation splice variants, fragments
D4.1-1, J4.1.4, G4.1.1, F4.1-2

Four different translation initiation splice
variants were detected by 5' RACE.

Method as described previously.
gene specific primer (97080803):
5' TCGGTTGTTAGCAGTAGTTGACCCTCC
nested gene specific primer (97080804):
5' ACCTGAAAGTCTGGACTGCATTTCAGC
Marathon cDNA library: human colorectal
adenocarcinoma SW480 (fragment D4.1-1)

gene specific primer (97080801):
5' ACAACCTGGATAATCTGGGCCAGGAGG
nested gene specific primer (97080802):
5' TCTTGCTGGAGATCCTTGATGAGACGC
Marathon cDNA library:
human melanoma G361 (fragment J4.1.4)
human HeLa S3 (fragment G4.1.1)
human placenta (fragment F4.1.2)

DNA sequencing

PCR amplification products and cDNA clones were
subcloned either into pBluescript vectors (Stratagene)
or into the PCR-IIa vector (Invitrogen) and sequenced either
manually by the dideoxynucleotide chain termination
method with modified T7 DNA polymerase (Sequenase,
United States Biochemical) or automatically with an
Applied Biosystems 373 DNA sequencer using the
fluorescent terminator kit (Perkin Elmer).

RNA blots

A human multiple tissue Northern blot (MTN-1,
Clontech) containing in each lane 2 µg of poly A+ RNA
from eight different human tissues (heart, brain,
placenta, lung, liver, skeletal muscle, kidney, and
pancreas) and an MTN-II human multiple tissue Northern blot,
containing in each lane 2 µg of poly A+ RNA from
spleen, thymus, prostate, testis, ovary, small
intestine, colon and peripheral leukocyte, were
hybridized according to the manufacturer's
instructions and washed in 0.1x SSC, 0.2% SDS at
55°C. Also from Clontech, a poly A+ RNA blot from
human cancer cell lines (melanoma G361, lung carcinoma
A549, colorectal adenocarcinoma SW480, Burkitt's
lymphoma Raji, leukemia Molt 4, lymphoblastic leukemia
K562, HeLa S3 and promyelocytic leukemia HL60) was
tested.

Construction of plasmids

Plasmid pCDU2 (Figure 17) was constructed by
cloning the 2.8 kb ApaI-NarI fragment from pTB72, the
latter restriction site made blunt with Klenow enzyme,
into pcDNA3 digested with EcoRV and ApaI. pCDU2
encodes the homology blocks A, B, C, D and E.
Plasmid pCDU3 (Figure 15) was constructed by cloning
the 1.9 kb ApaI-NdeI fragment from pTB72, the latter
restriction site made blunt with Klenow enzyme, into
pcDNA3 digested with EcoRV and ApaI. pCDU3 encodes
the homology blocks C, D and E. Plasmid pCDU4
(Figure 16) was constructed by cloning the 1.4 kb
ApaI-StyI fragment from pTB72, the latter restriction
site made blunt with Klenow enzyme, into pcDNA3 digested with
EcoRV and ApaI. pCDU4 encodes the homology block
E.

Expression of a domain of the human UNC53 in 
eukaryotic cells 

1. pCB201: Equivalent construct of human 1
homologue to expression construct pCDU4 of C. elegans
unc-53 gene cloned in a eukaryotic His-tag, Xpress Ab
tag expression vector.

A suitable BamHI site was engineered in the pHH15
open reading frame by amplification with the hh15fw primer
(5'-AGAGCGGATCCATATGCCTCCTTGCCGTCAAGGTG-3') and the M13rv
primer (5'-cag-gaa-aca-gct-atg-ac-3'). The amplified
fragment was then moved to the pcDNA3.1/HisA vector
digested with BamHI and EcoRI. This new plasmid,
called pCB201 (Figure 13), produces a cDNA which codes
for a fusion protein consisting of a 49 amino acid
aminoterminal fragment containing a His-tag and also
a T7 epitope tag, followed by amino acids 1255 to 1627
of the sequence of the human homologue. pCB201 was
also checked by sequencing and was then used in stable
transfection experiments carried out in N4, MCF7 and
NIH3T3 cells.

2. pLM5: Equivalent construct of human 1
homologue to expression construct pCDU3 cloned in a
eukaryotic His-tag, Xpress Ab tag expression vector.

The phage HH3b was linearized using XhoI. A
BamHI and an XbaI restriction site were created on the
pHH3b open reading frame using U3-Bfw
(5'-cca-cac-tag-ggg-atc-cat-gca-aat-gag-g-3') and U-rv
(5'-caa-aag-tct-cta-gag-gag-gcc-agt-3') as primers. This
amplified fragment was then moved to pBluescript KS
digested with BamHI and XbaI. Sequencing of this
plasmid, named pCB300, showed an amino acid change
from a serine to an asparagine due to a change from
guanine to adenine at position 4237 of the DNA
sequence. This fault was repaired by cloning a 1418
bp fragment of pLM1 (see below), excised with NarI and
XbaI, into pCB300 digested with the same
enzymes. The phage HH3b fragment of this plasmid,
named pLM6 (fig 53), was then transferred, using BamHI and
XbaI, to pcDNA3.1/HisA digested with the same enzymes.
This new plasmid, named pLM5 (fig 52), produces a cDNA
which codes for a fusion protein consisting of a 49
amino acid aminoterminal fragment harboring a His-tag
and a T7 epitope tag, followed by amino acids 1069 to
1627 of the transcript of HU-Unc53/1. Plasmid pLM5 was
checked by sequencing and used in transient and stable
transfection experiments carried out in N4 cells. The
plasmid pLM1 was created using a PvuII and partial
BamHI digested fragment of pHH14-3 and a BamHI and
SpeI digested fragment of phage HH3b, cloned into
pBluescript KS digested with SmaI and SpeI. pLM1
contains the full transcript of HU-Unc53/1 available
at this moment (see fig 9).

3. pCB251: Equivalent construct of human 1
homologue to expression construct pCDU2 cloned in a
eukaryotic His-tag, Xpress Ab tag expression vector.

The phage HH3b was linearized using XhoI. A
BamHI and an XbaI restriction site were created on the
pHH3b open reading frame using U2fw
(5'-aag-gga-tga-ttc-ggt-cag-gat-cct-tc-3') and U-rv
(5'-caa-aag-tct-cta-gag-gag-gcc-agt-3') as primers. The
amplified fragment was then moved to pCR2.1. This plasmid was
named pCB250. The pHH3b fragment was removed from
pCB250 using BamHI and XbaI and cloned in
pcDNA3.1/HisC digested with the same enzymes. This
plasmid, named pCB251 (figure 55), was checked by
sequencing. pCB251 produces a cDNA which codes for a
fusion protein consisting of a 49 amino acid
aminoterminal fragment harboring a His-tag and a T7
epitope tag, followed by amino acids 828 to 1627 of
the partial transcript of HU-Unc53/1. pCB251 was used
in transient and stable transfection experiments
carried out in N4 cells (see fig 56).
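
The forward and reverse primers used for constructs 1 to 3 above were designed to introduce BamHI and XbaI sites. A short Python sketch can confirm that the quoted primer sequences do contain the corresponding recognition sequences (GGATCC for BamHI, TCTAGA for XbaI). The function names normalise and find_sites are hypothetical helpers, and positions are 0-based offsets from the 5' end of each primer.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}

# Primer sequences copied from the text (5' to 3'); hyphens mark codon triplets.
PRIMERS = {
    "hh15fw": "AGAGCGGATCCATATGCCTCCTTGCCGTCAAGGTG",
    "U3-Bfw": "cca-cac-tag-ggg-atc-cat-gca-aat-gag-g",
    "U2fw": "aag-gga-tga-ttc-ggt-cag-gat-cct-tc",
    "U-rv": "caa-aag-tct-cta-gag-gag-gcc-agt",
}


def normalise(seq):
    """Remove triplet separators and spaces and convert to upper case."""
    return seq.replace("-", "").replace(" ", "").upper()


def find_sites(seq):
    """Return {enzyme: [0-based positions]} for every site found in the primer."""
    seq = normalise(seq)
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Each forward primer carries a single BamHI site and the reverse primer U-rv a single XbaI site, in agreement with the cloning steps described.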



4. pLM3: the partial transcript of HU-Unc53/1
cloned in a eukaryotic His-tag, Xpress Ab tag
expression vector

pLM1 was digested with EcoRV and XbaI. This
fragment was cloned in pcDNA3.1/HisB digested with
the same enzymes. pLM3 produces a cDNA which codes
for a fusion protein consisting of a 49 amino acid
aminoterminal fragment harboring a His-tag and a T7
epitope tag, followed by amino acids 1 to 1627 of the
transcript of HU-Unc53/1 available at this moment.
pLM3 was used in transient and stable transfection
experiments carried out in N4 cells.

5. pLM4: the partial transcript of HU-Unc53/1
cloned in a eukaryotic GFP expression vector

pLM1 was digested with ClaI and XbaI. This
fragment was cloned in pEGFP-C1 digested with AccI
and XbaI. This plasmid was named pLM4. It
produces a cDNA which codes for a fusion protein
consisting of GFP, followed by amino acids 1 to 1627 of
the transcript of HU-Unc53/1. pLM4 was used in
transient and stable transfection experiments carried
out in N4 cells (see figs 43 and 46).
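
The expected sizes of the His-tagged fusion proteins follow directly from the residue ranges stated for constructs 1 to 4. The sketch below works through that arithmetic; it assumes that no linker residues are present beyond the 49 amino acid tag fragment mentioned in the text, and the names CONSTRUCTS and fusion_length are hypothetical.

```python
TAG_LENGTH = 49  # aminoterminal His-tag / T7 epitope fragment, as stated in the text

# First and last HU-Unc53/1 residue carried by each His-tagged construct.
CONSTRUCTS = {
    "pCB201": (1255, 1627),
    "pLM5": (1069, 1627),
    "pCB251": (828, 1627),
    "pLM3": (1, 1627),
}


def fusion_length(first, last, tag=TAG_LENGTH):
    """Length in amino acids of the tag plus the inclusive residue range first..last."""
    return tag + (last - first + 1)


for name, (first, last) in CONSTRUCTS.items():
    print(name, fusion_length(first, last))
# pCB201 422
# pLM5 608
# pCB251 849
# pLM3 1676
```

The GFP fusion pLM4 is left out because the text does not state the length of the GFP portion.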

Stable transfection of MCF-7 cells: 

Cells were seeded at a density of 2x10^[?] cells in
a 75 cm2 flask using standard culture medium
(Dulbecco's MEM, 450 mg/l glucose, 862 mg/l L-Alanyl-L-Glutamine,
110 mg/l Na-pyruvate; GibcoBRL)
supplemented with 10% foetal calf serum (FCS;
GibcoBRL), 100 U/ml penicillin (GibcoBRL) and 100
µg/ml streptomycin. The culture was grown at 37°C in
a 10% CO2 atmosphere to approximately 70% confluency
(approximately 18 hours). The culture medium was
removed and 10 ml MEM-HEPES (GibcoBRL) supplemented
with 10% FCS was added to the cells. The culture was
further incubated for four hours at 37°C in standard
sterile air. DNA-CaCl2 was meanwhile prepared by
mixing 30 µg DNA in 0.1x TE (1 mM Tris-HCl, pH 7.2,
0.1 mM EDTA, pH 8) and 0.1 ml 1.25 M CaCl2/HEPES (1.25
M CaCl2, 0.125 M HEPES; pH 7.05). 0.1x TE was added
to a final volume of 0.5 ml. The DNA-CaCl2 was added
drop by drop to 0.5 ml BS/HEPES (25 mM HEPES, 0.25 M
NaCl, 0.01 M KCl, 1.4 mM Na2HPO4, 0.01 M glucose, pH
7.05) while pipetting a sterile airflow through the
latter solution. The DNA-Ca3(PO4)2 precipitate was
then placed at 37°C for ten minutes. The DNA-Ca3(PO4)2
precipitate was vortexed and added to the cells,
together with 100 µl of a 0.01 M chloroquine (Sigma)
stock in H2O. After four hours of incubation at 37°C
in sterile standard air, the medium was removed and
the cells were washed with PBS (GibcoBRL). 25 ml of
medium was added and the cells were placed at 37°C in
a 10% CO2 atmosphere. After 48 hours of incubation,
the cells were harvested, diluted and cultivated under
selection (600 µg/ml G418 (Duchefa)) for two weeks
prior to clone selection. Mock transfected MCF-7 were
positive for the beta-galactosidase transgene. The
stability of transfection in MCF-7 was assessed by
passaging cells four times in the absence of Geneticin
and then re-exposing them to the selector agent. In
these experiments, unc-53 or mock transfected cells
proliferated, whereas untransfected MCF-7 cells
proliferated at a much slower rate.

Stable transfection of N4 neuroblastoma cells

Cells were seeded at a density of 2x10^[?] cells in
a 25 cm2 flask using standard culture medium (MEM
Rega 3; GibcoBRL) supplemented with 10% FCS, 0.14%
Na2CO3, 2 mM glutamine, 100 U/ml penicillin and 100 µg/ml
streptomycin. The culture was grown overnight at
37°C in a 10% CO2 atmosphere. The transfection mixture was
prepared by adding 12 µg DNA in 600 µl Optimem
1 (GibcoBRL) to 36 µl LipofectAMINE (GibcoBRL) in 600 µl
Optimem 1. This was done by adding the first solution
drop by drop to the second. The mixture was
left for 30 minutes at room temperature, after which
1.8 ml of Optimem 1 was added. In the meantime the
cell culture was washed twice with Optimem 1, and the
3 ml of transfection mixture was added. The culture
was placed at 37°C in sterile standard air. After
four hours, 3 ml of normal culture medium was added
and the culture was placed at 37°C under 10% CO2.
18 hours later, the culture was washed with PBS, and
fresh normal culture medium was added. A further 24
hours later, the cells were harvested, diluted and
cultured under selection (750 µg/ml G418) for two
weeks prior to clone selection.
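
The volumes given for the N4 transfection mixture can be checked against the stated total of about 3 ml. The following Python sketch adds up the components, reading the volume units as microlitres, and derives the LipofectAMINE to DNA ratio; the dictionary keys and function names are hypothetical labels introduced for illustration.

```python
# Component volumes in microlitres, as read from the protocol.
COMPONENTS_UL = {
    "DNA in Optimem 1": 600,
    "LipofectAMINE": 36,
    "Optimem 1 mixed with LipofectAMINE": 600,
    "Optimem 1 added after 30 min": 1800,
}
DNA_UG = 12  # micrograms of DNA in the first solution


def total_volume_ul(components):
    """Total volume of the transfection mixture in microlitres."""
    return sum(components.values())


def lipid_per_ug_dna(components, dna_ug):
    """Microlitres of LipofectAMINE per microgram of DNA."""
    return components["LipofectAMINE"] / dna_ug


def dna_ug_per_ml(components, dna_ug):
    """Final DNA concentration in micrograms per millilitre."""
    return dna_ug / (total_volume_ul(components) / 1000)


print(total_volume_ul(COMPONENTS_UL))                   # 3036
print(lipid_per_ug_dna(COMPONENTS_UL, DNA_UG))          # 3.0
print(round(dna_ug_per_ml(COMPONENTS_UL, DNA_UG), 2))   # 3.95
```

The total of 3036 µl matches the "3 ml of transfection mixture" added to the washed culture.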

Fixation of cells for immunofluorescence

Medium was removed from the 9 cm2 wells
containing the coverslips. A 4% solution of
paraformaldehyde (Sigma) in PHEM (1 g/l glucose, 0.4
g/l KCl, 8 g/l NaCl, 0.06 g/l KH2PO4, 0.0475 g/l
Na2HPO4, 0.35 g/l NaHCO3, 1.51 g/l PIPES, 0.76 g/l
EGTA, 0.19 g/l MgCl2; pH 6) was added for 30 min at
room temperature. The fixative was removed, and the
coverslips were washed three times for 10 minutes with
PHEM. The coverslips were then placed in PHEM
containing 0.5% Triton X-100 (Serva) for 30 minutes,
after which they were washed again three times for
10 minutes with PHEM. The coverslips were then placed
under PBS (0.14 M NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8
mM KH2PO4, pH 7.3) containing 0.2% Tween (Sigma) for at
least one hour at 4°C.

Immunofluorescence staining

The coverslips were inverted on 35 µl of
appropriately diluted antibody, being YL 1/2 for
tubulin and/or mab 16-48-2 monoclonal or anti-UNC53
(gp48) polyclonal antibody for UNC53. The slides were
placed at 4°C for at least 18 hours. Excess
primary antibody was then removed by three washes of
ten minutes in PBS-Tween. The slides were then
treated with secondary antibody in the same way as for
the primary antibody. F-actin was labelled by
adding TRITC- or FITC-coupled phalloidin to the
incubation buffer. The slides inverted on the
secondary antibody were left at room temperature for
approximately one hour. Slides were then washed again
three times for ten minutes with PBS-Tween and once
with PBS. The coverslips were mounted on slides with
the medium described by Herzog et al. (Cell Biology:
a laboratory handbook, 1994, Academic Press, 355-360).
After at least two hours, slides were ready for
analysis.

Time lapse analysis

Analysis of the behaviour and movement of growing
cell cultures was done by placing a non-confluent
culture under a phase contrast microscope equipped
with a temperature controlled stage (37°C). Images
were recorded using a CCD camera (COHU 4912) coupled
to a SCION LG3 framegrabber in a Macintosh PPC 8100
running NIH Image version 1.60. Images were recorded



WO 98/24810 PCT/EP97/06956 




at time intervals varying from 15 sec to 1 min for
half an hour to two hours. Image enhancement and
playback were done in NIH Image.
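
The recording intervals and durations quoted above determine how many frames each time lapse sequence contains. A minimal Python sketch of that calculation follows; the function name frame_count is hypothetical, and it is assumed that one frame is taken at time zero and one at every interval thereafter.

```python
def frame_count(duration_s, interval_s):
    """Number of frames when recording at t = 0 and then every interval_s seconds."""
    return duration_s // interval_s + 1


HALF_HOUR = 30 * 60
TWO_HOURS = 2 * 60 * 60

print(frame_count(HALF_HOUR, 15))   # 121
print(frame_count(TWO_HOURS, 60))   # 121
print(frame_count(TWO_HOURS, 15))   # 481
```

A 15 second interval over half an hour and a 1 minute interval over two hours give the same number of frames, so both extremes produce sequences of comparable size for playback.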

Phagokinesis

A variety of cell types have been shown to migrate
over colloidal gold coated culture plastic or glass
and to displace or phagocytose the gold lawn on their way
while locomoting. The track left bare is a
qualitative and quantitative measure of cell motility
and/or locomotion. The basic methods have been
described in detail elsewhere (Albrecht-Buehler, 1977,
Cell, 11: 395; Zetter, 1980, Nature, 285: 41; O'Keefe
et al., 1983, J. Invest. Dermatol., 85: 130). Culture
plates were gelatin and gold coated as described by
Albrecht-Buehler (1977). Unc-53 and mock transfected
MCF-7 were seeded in plates at low density and allowed
to adhere to the plate and to locomote overnight.
Cells were chemically fixed to the plate, washed and
air-dried. Images of the gold lawns were captured
using automated videomicroscopy; composite images of
the wells were generated and single-cell phagokinesis
tracks were measured using a home-made routine in
SCIL™ software.
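
The home-made track measurement routine is not described in the text. As an illustration of the general idea only, the Python sketch below measures the area of each cleared track in a thresholded image by labelling 4-connected regions. It is a hypothetical stand-in, not the routine actually used: the binary mask format (1 = gold removed, 0 = gold lawn intact) and the function name track_areas are assumptions.

```python
from collections import deque


def track_areas(mask):
    """Return the pixel area of every 4-connected region of 1s in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited cleared pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


# Toy mask with two separate tracks.
EXAMPLE = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
print(track_areas(EXAMPLE))  # [3, 2]
```

Multiplying each pixel count by the calibrated area of one pixel would convert the result into a physical track area per cell.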

C. elegans UNC-53 preferentially binds
microtubule plus ends or GTP-tubulin

1. Cloning of C. elegans cDNA in pEGFP-C1 and
construction of C-terminal deletion derivatives.

a) Constructing a GFP-Unc53 N-terminal fusion:
A PCR experiment was performed under
standard conditions, using pTB72 as template and cp17
(ata gcc aga tct acg tca aat gta gaa ttg) and cp18
(ttt aga aac cgc ggg tgg) as primers. The resulting
0.4 kb fragment, coding for the N-terminal fragment of
C. elegans Unc53, was cloned in vector pCR2.1 (original
TA cloning kit, Invitrogen), resulting in plasmid
pTA1718. The 0.4 kb fragment was isolated as a
BglII-SacII fragment and cloned in pEGFP-C1 (Clontech)
digested with the same enzymes. The resulting plasmid
was designated pEGFPsac (Figure 29). pEGFPsac encodes
the N-terminal 13 aa of C.e. Unc53 in fusion with GFP.

b) Construction of a GFP-C.e. Unc53 full length
fusion:

Plasmid pTB72 (shown in Figure 1) was
digested with restriction enzymes SacII and ApaI. The
resulting 4.5 kb cDNA fragment, encoding the C-terminal
fragment of C. elegans Unc53, was cloned in
plasmid pEGFPsac (Figure 29) digested with the same
enzymes, resulting in plasmid pEGFP72 (Figure 30).
Plasmid pEGFP72 encodes GFP in fusion with the full
length C.e. Unc53.

c) Construction of C-terminal deletions of the GFP-C. elegans
UNC-53 fusion protein, other than pEGFPsac:

pEGFP72 was digested with SmaI. The
resulting 7.0 kb fragment was religated and
transformed in E. coli, resulting in plasmid pEGFPsma
(Figure 31). This plasmid codes for the first 760 aa
of the C.e. UNC-53 in fusion with GFP.

pEGFP72 was digested with restriction enzymes
Ecl136II and SmaI; the resulting plasmid, after
ligation and transformation in E. coli of the 6.7 kb
fragment, was designated pEGFPecl (Figure 32). This
plasmid codes for the N-terminal 670 aa of the C.e.
Unc53 in fusion with GFP. pEGFP72 was further
digested with SmaI and XbaI. The latter site was made
blunt with Klenow polymerase. The resulting fragment
of 5.4 kb was religated and transformed in E. coli.
The resulting plasmid was designated pEGFPxba (Figure
33). This plasmid codes for the N-terminal 256 aa of
C. elegans Unc53 in fusion with GFP.

2. Constructing a hu1-UNC-53-GFP fusion, and a
deletion derivative

The 5.4 kb hu1-unc53 fragment was isolated as a
ClaI-XbaI fragment from pLM1 (Figure 54) and cloned
in pEGFP-C1 digested with AccI and XbaI. pEGFP-C1 was
isolated from E. coli GM41 (Hfr H, dam-3, thi-1, rel-1).
This makes the XbaI restriction site available
for restriction digest. The resulting plasmid was
designated pLM4 (Figure 34).

3. Visualisation of GFP fluorescence in N4
cells

N4 neuroblastoma lines were seeded in Lab-Tek
chambered coverglass (Nalge Nunc International) and
transfected using LipofectAMINE (GibcoBRL). After 18
hours, the chambered coverglasses were placed on an
inverted microscope, and GFP fluorescence could be
visualised.

4. Staining GFP fusion expressing cells with 
antibodies 

Transfection with the GFP fusion constructs was
also performed on coverglasses in a 6-well plate.
After paraformaldehyde or methanol-acetone fixation,
cells could be stained for the actin cytoskeleton with
TRITC-phalloidin, for hu-unc53 with sera 28.1 and for
tubulin with YL1/2 antibody. Visualisation was then
performed on an Axioplan microscope (Zeiss).

Methods of Producing and Observing the Effects of
a Chimeric unc-53 gene

1. Definition of a promoter region in the
C. elegans unc-53 gene:

The genomic region from position 15621 to
18415 in the C. elegans unc-53 gene, called promoter A,
was cloned and fused to the cDNA of the GFP gene
(clone pA/GFP, or pNP10) (cf. fig. 51). This construct
was injected into wild type worms (N2). Transgenic
lines express GFP in different neurones: the two pairs
of pioneering neurones PVP and PVQ, both BDU neurones,
both ALN and PLN neurones, both PDE neurones, both PVM
neurones, and 4 vulval cells. Expression begins in
early embryogenesis, when the axons of those neurones
grow out.

2. Mutant phenotype in unc-53(n152) alleles:

In wild type worms (N2), the two pairs of ALN and
PLN neurones each send an axon in a straight line
longitudinally from the tail to the head (see
fig. 50a). In unc-53(n152) mutants, the axons are
shorter and often branch in a dorso-ventral direction
(see fig. 50b). The neurones are visualised with the
construct pA/GFP, injected in unc-53(n152) worms.



3. The minigene pA/unc-53 rescues the
elongation defect of ALN and PLN neurones:

The promoter A from the C. elegans unc-53 gene was
fused to the cDNA of the C. elegans unc-53 gene (clone
pA/unc53, or pNP9). This construct was injected in
unc-53(n152) mutant worms, together with the pA/GFP
construct described above to visualise the ALN and PLN
neurones. The elongation defects of those neurones in
the unc-53 mutant are almost completely restored by
the expression of the unc-53 cDNA under the control of
promoter A (see figs. 50 and 51b).

4. Domain swap between the C. elegans and human
unc-53 gene:

To test whether the vertebrate and worm members
of the unc-53 family are functionally equivalent, we
tested the ability of the human gene to rescue the
mutant phenotype in the worm. We replaced the
carboxy terminal predicted nucleotide binding domain
(NTPase) of the worm protein with the homologous
fragment of the human 1 gene.

The C. elegans NTPase domain was deleted from the clone
pA/unc-53, from the HpaI site at position 29800 of
the genomic sequence of unc-53, and replaced by the equivalent
domain of the human-1 gene (unc-53H1) (see fig. 51).
The resulting clone is named pA/unc-53H1. When this
clone is injected into unc-53(n152) mutants, the
transgenic worms show a significant but incomplete
rescue of the defect in the elongation of the ALN and
PLN neurones (see fig. 51b). The axons are longer,
often elongated in a straight line as far as the region
of the vulva, and no longer branch dorsally.
This result shows that the NTPase region of the human
unc-53 homologue can functionally replace the NTPase
region of the C. elegans protein.

The degree of rescue was analyzed quantitatively
and summarized in Figure 51b.
The four strains compared are:
wt; unc-53(n152); unc-53(n152), pA/unc-53;
unc-53(n152), pA/unc-53-H1.
The various phenotypes observed are brought together
in three large classes:
«wild type»: the axon is straight, unbranched and
migrates into the head;
«vulva»: the axon is straight, unbranched and stops
in the vulva region;
«mutant»: the axon is short, does not reach the vulva
region and has collateral branches.
The figures are indicated as a percentage. The number
of axons observed is indicated in the following
column.

The data clearly show that
the nematode/human chimera minigene pA/unc-53-H1
partly rescues the defects of the axonal migration of
the ALN and PLN neurones and demonstrate conservation
of function of this domain between man and worm. The
transgenic lines provide a functional screening assay
for the motility function of at least part of the
human UNC-53 gene.

II. Materials and methods

1 - Cloning:
a) pAB/GFP (pNP3 - Figure 27)

The gene of GFP has been amplified by PCR with
the oligonucleotides cpn3
"acattaagcttcgtacgcttggagggtaccg" and cpn5
"gaaaggatccgtacgataaggtattttgtgtcgg" on the plasmid
pPD95.75 (Figure 59) so as to be inserted at the 5'
position in fusion into the exon 12 of the unc-53 gene
at a single restriction site SplI and contains its
stop codon at 3' plus one polyadenylation site. The
PCR amplification product is digested by HindIII and
BamHI, sites which are contained respectively in the
cpn3 and cpn5 oligonucleotides, and sub-cloned in the
pBS vector (clone pNP2). The GFP is then excised from
the pNP2 clone at the site SplI and integrated into
the X16 clone (Figure 60) originating from sub-cloning
of the lambda phage S4 digested by XhoI. The X16 clone
contains the genomic sequence of unc-53 from the
position 16621 to the position 24891 cloned in the
site XhoI of pBS.

b) pAB/unc-53 (pNP8 - Figure 35)

The promoter region AB of the X16 clone (between
PstI and SplI) has been inserted in the clone pTB115
(Figure 58) in which the region between the sites PstI
and SplI, containing the promoter of the gene mec-7
and the start of the gene unc-53, has been removed.

c) pA/GFP (pNP10 - Figure 56)

The promoter region A has been taken from the X16 clone
between the sites PstI and NheI and integrated in the
vector pPD95.75 containing the GFP in the sites PstI
and XbaI.

d) pA/unc-53 (pNP9 - Figure 44)

The promoter region A has been taken from the X16 clone
between the sites PstI and BstXI and integrated
into the clone pTB115 in which the region between the
sites PstI and BstXI, containing the promoter of the
gene mec-7 and the start of the gene unc-53, has been
removed.

e) pA/unc-53-H1 (pCB501 - Figure 57)

The 3' region of the nematode unc-53 gene
between the sites HpaI and NcoI has been deleted from
the clone pA/unc-53 (pNP9). The 3' region of the
H1 unc-53 gene has been amplified by PCR with the
oligonucleotides U4Afw
(5'-gca-cat-cgt-taa-cgg-gga-ctt-gaa-gc-3') and Urv
(5'-caa-aag-tct-cta-gag-gcc-agt-3')
and digested with HpaI and XbaI. After a
filling-in step with T4 polymerase, the ligation is
carried out with blunt ends.

2 - Injection

Conventional injection techniques are used (Fire A.,
1986; Mello C. et al., 1991; Mello C. and Fire
A., 1995). Young hermaphrodite adults are injected in
their two syncytial gonads. The DNA used is prepared
in standard manner (Qiagen) followed by precipitation
with lithium chloride. After an extensive rinsing
step to eliminate all the salts, the DNA is
resuspended in water. The injection solution contains
the different DNAs at a concentration of 100 ng/µl in
an injection buffer. The plasmid pRF4 containing the
dominant allele su1006 of the gene rol-6 (Kramer J.
et al., 1990; Mello C. et al., 1991) is used as a
transformation co-marker. The descendants of the
injected hermaphrodites showing the roller phenotype
are isolated.
Approximately 10% of these transformants will yield a
stable strain, in which the different DNAs injected
are associated to form a mini-chromosome which will
segregate as unstable extrachromosomal arrays. All the
transgenic strains obtained were tested by PCR for the
presence of the DNA injected, using a specific primer
of the vector and a primer in the gene (results not
shown).

3 - Microscopy

The nematodes are observed under a ZEISS Axioplan
microscope provided with Nomarski optics, with 40X
Neofluar, 63X Plan-Apochromat and 100X Plan-Apochromat
objective lenses. For fluorescence observation the
light source is a mercury bulb. Different ZEISS
filters are used:

- for observation of GFP fluorescence, FITC filter:
blue excitation line at 488 nm, emission through a
515-565 nm band-pass filter;

- for observation of the antibody labelling with a
secondary antibody coupled to TRITC:
excitation through a 546 nm band-pass filter, emission
through a 590 nm long-pass filter.

The image acquisition is effected by means of a CCD
camera and the NIH Image program using a Macintosh
computer. The images are processed using the Adobe
Photoshop program.



Sequence Listing

The following sequences are referred to in the
specification:

Sequence ID No 1 is an amino acid sequence of
human homologue 1 of UNC-53 protein illustrated in
Figure 9b.

Sequence ID No 2 is an amino acid sequence of
human homologue 2 of UNC-53 protein illustrated in
Figure 11d.

Sequence ID No 3 is a nucleic acid sequence of
the hu-1-unc-53 gene illustrated in Figure 9b.

Sequence ID No 4 is a nucleic acid sequence of
the hu-2-unc-53 gene illustrated in Figure 11d.

Sequence ID No 5 is a nucleotide sequence of
Phage Lambda Clone 3b deposited under Accession No LMBP
3595 and illustrated in Figure 9.

Sequence ID No 6 is a nucleotide sequence of
plasmid pLM1 deposited under Accession No LMBP 3762
and illustrated in fig 54.

Sequence ID No 7 is a nucleotide sequence of
plasmid pLM4 deposited under Accession No 3763 and
illustrated in fig 34.

Sequence ID No 8 is a nucleotide sequence of
plasmid pEGFP72 deposited under LMBP Accession No 3764
and illustrated in fig 30.

Sequence ID No 9 is a nucleotide sequence of
plasmid pCB501 deposited under Accession No 3765 and
illustrated in fig 57.

Sequence ID No 10 is a nucleotide sequence of
plasmid pCB201 deposited under Accession No. LMBP
3594.



WO 98/24810 PCT/EP97/06956

[Figure: "Hu-Unc53/1 seq (1 > 6013) Site and Sequence",
printout dated Tuesday, 18 November 1997. Double-stranded
nucleotide sequence of Hu-Unc53/1, positions 1 to 6013,
with the amino acid translation beneath it. Annotated
features: pCR2.1 linker; lambda gt10 primer; EcoRI site;
"suspect sequence linker?"; clones pHH14-3, pCB212,
pCB210-14, pHH3b and pHH15; reverse primers, including
HU53rv4; peptide markers; pLM7 ORF; full available ORF
Hu-Unc53/1 = pLM1 ORF; U2 ORF; U3 ORF = pLM5 ORF;
U4 ORF = pCB201 ORF; terminal linker.]
WO 98/24810 PCT/EP97/06956 

lib 



h~ - ~r«~ - S^l ' d* A< 



CirArcrGCACAArTCGGcrTCTTrGAGCAicrrcAsccrGGrTAAGTCCAAGCT'GAArrccjGGGA-iAGccGAGccccArcccrccicOAcccTArGc-; 

CTA-AGACGTCTTAAGCCGAAGAAACTCGTTCAAGrCGGACCAATTCAGGTrCGACrTAAGGCCCCriTCGGCrCGGCCTAGGGAGCTCCraGGATACGC 



' pCR2.1 linker ' lambda gl 10 primer EooR I I suspect sequence linker? ^-pHHl4-3 — 

t S A E F 6 F F S 0 V 0 P G VQAEFRGiCPSR I PRgp YA 

GaSGTCAAGCCGCTCAGCAAGCCGCCTGA^GCGGCCGTGAGCGAAGAT^^ 

CTCCAGTTCGGCGAGTCGTTCCGCGGACTTCGCCGGCACTCGCTTCTACCGTTTAGCCTGCTGCTCGACGAGAGGTCGTTCCGGTTCCGCGTTTTCTCGA ^ 



•PHH14-3 



^ V < P t SK A P E AAV S £ D GK SO 0 E L L SSK A K A Q K 5 ' 

C TG'jGCC TGTCCCCTCTGCCAAGGGCCAGGAGuAGCGCGCCTTCC TCAAGGTGGACCCCGAGCTGGTGGTGACCGTGCTGGGAGACCTGGAGCAGCrGCT 
GACCCGGACAGGGGAGACGGTTCCCGGTCCTCCTCGCGCGGAAGGAGTTCCACCTGGGGCTCGACCACCACTGGCACGACCCrCTGGACCTCGTCGACGA 



-pHH14-3 



5 G P V P S A K GQ££RAFLKV0PELVVTVLGOLEQtL 
CTTCAGCCAGAfGC TGGACCCAGAGTCCCA GAGAAAGAGGACAGTGCAGAATGTCCTGGATC TCCGGCAGAACCTGGAAGAGACCATGTCC AGCCTGCGA 

' 1 i i i ■■ i i i i i i < i i t m^,-, 

GaaGTCGGTCTACGACCTGGGTC TC AGGGTCTC"TTC TCCTGTCACGTCTTACAGGaCCTaG aggccgtcttggaccttc TC TGGTACAGGTCGGACGCT 



— — -DHH14-3 

g 

I — — 

I 

ORF ( 1 -573t;p i Z pLM? ORF 



1 — full available ORF HU-Unc53/1 a pLMl OR 

FSQMLOPESORJCR TVONVLOLRONLEE TMSSLR 



T CCC AGGTGAC TC ACA3C TCCC ~GG AGA* 3-" T3C TACCACAGCGA T5A TGCC A A CCC AC GCAGCG TGTCCAGCC TC TCC A AC CGC TCG TC C CC TC 
CCCAGGGTCCAC rGAGTGTCGAGGGACC TC TA-Z "GGACGATGC TGTCGCT AC TACG3TTGGG TGCG TCGCACAGGTCGGAG AGGT TGGCGAGC AGGGGA3 



-pHHt4<3 





— OP.= { 1 -575oo) a pLM7 CRF 




: 5 0 v r « 3 s L E ^ 


.'-U available ORF HU-Unc53/i s pLM! OR 

" :yOSDDa.N=>3S7S3L S N R 3 5 p 


-3 ~ : A rc GCGC r A TGG CC A 3 7 r C AG "CCGC GGC * 


::A2GCTG3rGACGCGCCCTCT:"G3GTGGGAGCrGCCGCTCGGAGGGGACGCCCGCCTGGTACfi: 


jTaccgcga'accggtcaggtcaggcgc : 


-GTCCGACCACTGCGCjGGAGAC aCCC-CCC "CGACGGCGAGCC TCCCC TGCGGGCGGACCA TGTa 



PHH14-3 



















OR F M • S 7 vc ?; = pC .V .* OR .= 










- « a-, jilot!* CRF MU UncSj/: = pUMl C- 




U > '* = < G 




•; 3 : 


: a c a p > ; g ; 


IPSEGTPAV^r* 
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Tuesday. t8 November 1997 10:33 ^ 9 ^ p ago i. 

fig Hu-UncS3/1 seq (1 >6013) Site a no xuenco 

c>: acggcgaacgggccc ac fac tcccac ac catgcc: atgcgcagccccagcaagc rcAGccATArc rcccGccrGGAGC TssrcGAArccc rGSArrcs 

CO'GCCGCTTGCCCGGGrGArGAGGGTGTGGrACGGGrACGCGTCGGGGTCGTrCGAGTCGGTATAGAGGGCGGACCrCGACCAGCTTAGGuACCTGAG: 



-pHH14-3 



- ORr { 1 -579DD) a pLM/ ORF 



■full available ORr HU-Unc53/1 s pLM! OR 



HGERAHYSMTMPMRSPSKLSH ISQLELVESL03 
GArGAGGTGGACCTCAAGTCCGGCrACATGAGCGACAGrGACCTCATGGGCAAGACCATGACGGAGGATGATGACATCACTACCGGCTGGGATGAAAGCA 

i i ■ ■ ' ■ 1 ■ • ' ■ ' - ax> 

C 7ACTCC ACC TGGAGTTCAGGCCGATGTAC TCGC 7GTCACTGGAGTACCCGTTCTGG 7 AC TGCC TCCTACtAC TGTAG TGATGGCCGACCC TACTTTCG7 



» P HHl4-3 



•ORF (1«5790p) a pLM7 GRF 



-full available ORF HU-UncS3/1 » pLM! OR 



CEVOLKSGrMSDSOLMGKTMrEOODI TTGVOES 
GC 7CCATCAGTAGTGGACTCAGCGATGCCTCAGAC AATC TCAGTTC AGAAGAATTCAATGCCAGCrCCTCACTCAACTCCCTCCCAAGTAC TCCCACTGC 

■ ■ \ i . i 1 h — i » i — i • -?:o 

CGAGGTAGrCArCACCTGAGTCGCTACGGAGTCTG7TAGAGTCAAGTCTTCTTAAGTTACGGTCGAGGAGTGAGTTGAGGGAGGGrTCATGAGGGTGAC3 



•pHH14-3 

t 



-pCB212 



•2RF ;i-57&tpj a pLM7 ORF 



•:-JI available ORF HU-Unc53/i = pLMl OR 



5SI5SG*. S0AS:SL5SEEFNA3SSLNSLPSTPTA 

~ "3 7CGC AGGAACTCAA3AA7AG TZZ "ACGCACA-j-C 73 AGAGAAGCGCTC ACTGGC AGAAAGTGGGCTGAGC TGG TTTAG 7G4ATC AGAGGAGAAAGC3 

i • • ■ ' 

--:A3CGTCCTTGAGTTGr7AT:AC:ATGCGrG'C*GAGrCTCTTCGCGA3rGACCG7CT7TCACCCGACrCGACCAAATCACrTAGTC7CC7CTTrCGG 



-PHH14-3 



-PCB212 



„:i available ORF HLNUnc53/l = pLM1 OR 



: P R N 5 7 V , 5 " 3 3 E K R 3 L A E S G L S v F S E 5 £ E * A 
C»: "AAAAAAC FGG AG r A3 3 A3 A3 733 "A2CC73--jA "3GAACC T3GG AC r TC TAAG 7GGCGGAGGGAGC3GC 3 7GAGAGC 73 7GATCA 7 r ; A T3C AAG3 

"rrrr :g ac crc-r:c r*3 tcacc atcggaz " ~i7A3crrGGACCC fgaagattcaccgcc rccc tcgccggactc tcgac ac rAC'AA37A3GT7c: 



•pHHIA-3 



-PCB212 



a*j!latlrfORr HU-Unc53/l s?LMl OR 



e ? 3 r $ ► » p -: •> :> v -: : o c- ; s > 



WO 98y24810 1 1 7 PCT/EP97/06956 



Tuesday. 18 November t997 10:33 ^74 9 P4§aJ 

Ug Hu-Unc53/1 seq (1 >6013) Site af sequence J 

G r 2GAGAAC TGAAAAAGCCCATC AGCC T3GGCCACCC TGGT TCCC 'GAAGAAGGGCAAGACCCCACC TG TQGC TO TaaC f fCCCCCA TC AC ^CACACag; 
CiCCrCTTGACTTTrTCGGGrAGTCGGACCCGGTGGGACCAAGGGACrrcrrCCCGTTCrGGGGTGGACACCGACATrGAAGGGGGTAGTGAGTGTGrca '" w ' 



•pHHU-3 



•PCB212 



-full available ORF HU-Unc53/T = pLMI OR 



GSELKXPISLSHPGSLKKGKTPPVAVTSPITHTA 
CC AGAGTGCCC TCAAAGTCGC AGGC AAACC TGA3GGC AAAGC TACAGACAAGGGTAAGCTfGCAGTGAAGAATACTGGGC TCCAACGCTCC TCCTCTGA" 

1 ill > i I » |ii t i | | j ■ i r J V , 

GGrcrCACGGGAGTTrCAGCGTCCGTTTGGACTCCCoTTTCGATGTCTGTTCCCATTCGAACGTCACTTCTTArGACCCGAGGTTGCGAGGAGGAGACTA 



•PHH14-3 



'PCB212 



Mull avaifaWe ORF HU-UncS3/1 a pLM1 OR 



OSALKVAGKPEGKATOKGtCLAVKNTGUQRSSSO 
GCTGGTCGGGACCGCC TGAGTGATGCTAAGAAGCCCCCC TCGGGCATTGC TCGCCCC ICC AC TTCGGGA TCCT TTGGC T AC A AG A AGCC TCC TCCTGCC 

*- . ' ' ' ' ■ ' ■ * ruc<; 

CGACCAGCCCTGGCGGACTCACTACGATTCTTC3GGGGGACCCCGTAACGAGCGGGGAGGTGAAGCCCTAGGAAACCGATGTrCTTCGGAGGAGGACGG- 



-pHH14-3 



■PC8212 



-iu:i available ORF HU-Unc53/l = pLM1 OR 



-G«CPLSOAJCK?=>SG I ARPSTSGSFGYXKPPPA 

C A3GC AC AGCCACTGTC ATGCAAAC TGG TGG "TCA5CCACTC TCAGCAAGATCCAGAAGrCC TCAGGCATCCC TGTCAAGCC AGT AAATGG3C3CAAGA: 
G':CGTGrCGGTGACAGTACGTrT3ACCACCAA:-;32TGAGAGTCGTTCTAGGTCrTCAGGAGTCCGTAGGGACAGTTCGGTCArTTACCCGCGTKT3 



•pHHU-3 



-pCB2l2 



■fwil available ORF HU-UncS3/l = pLMI OR 



" 3 T A T V M G F G 3 S A TL 5 K IQK5SG I P V K P V N 3 3 > » 

rrA3ATG*rr::AACAGrcCA3A3:cA5:A*-;: rGGcrccTGGAGcccGTTc faaca rccAG taccgc agcctgccccggccagc: aag'c aa-j- 

a *:GAAT3-ACAAA33r7GrCA:Gr;TC3GT-::-iA3:ACCGAG3AC:rCGGGCAAGArr3rACGTCArG3CGrCGGACGGG3CC3GTCG3rrCAGr:C- 



*pHH14.3 



•PCB212 



a* a: la bid CRF HU-UncSO/l npLMlOR 

• a - G A R 5 N I 0 Y J? 5 L * B - A 



WO 98/24810 PCT/EP97/06956 
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Tuesday. 16 November 1997 10:33 
ftg Hu-Unc53/1 seq (1 >6013) Site a» sequence 

rcr a rGAGCOrGACCGGCCGGCCCuGTSSACCrCGCCCTGTGAGCAGCACCATTGACCCC A3 TC TCC TCAGCACCAAGC AGG3A3GCC 77ACGCCTTCC A 

agatactcgcac tggccgcccgccccacctggagcgggacac tcgtcgtcg taac tgggg "cagaggagtcgtgg ttcgtccc tccggaatgcggaagg* 7: *' 

pHH14-3 — 

pCB212 

pCB210M4 

iull available ORF HU-UncS3/1 a pLMl OR I 

j - f yv crinwr HU53r.M L 
SMSVTGGRGGPRPVSSSIOPSLLSTKQGGLTPS 

GACTGAAGGAGCCTACCAAGGTAGCCAGTGGGCGGACCACTCCAGCCCCTGTCAATCAGACAGATCGGGAAAAGGAGAAGGCCAAAGCCAAGGCAGfGGC 
CTGACrTCCTCGGATGGTTCCA7CGGTCACCCGCCTGGTGAGGTCGGGGACAGTTAGrcrGTCTAGCCCTTTTCCTCTTCCGGTTTCGGTTCCGTCACC3 



pHH 1 4-3 

pCB212 1 



PCB210-14 



.'ull available ORF HU-Unc53/1 = plMl OR 

PL KEPTK V ASG3 T TPAPVNQTQPEKEKAKAK A V A 

C 7 7CGAC TCAGACAACATC TCCTTGAAG AGTAT7GGC TCC CC AG AAAG TAC TCCCAAGAACC AAGC AAGCC ACCCCACAGCCACCAAGCTGGCAGAGCT j 
GAACCrGAGTCTGTTGrAGAGGAACTTC TCA7AACC3 A3GGG TC TT TC AT G AGGG TTC T TGG TTCG TTCGG TGGGG TGTCGGTGGTTCGACCGTCTCGAL* 

— pHHl4-3 — — 

PCB210-14 ■ 



!ull available ORF HLM,'nc53/i = pLMi OR 

L 0 30 NI5L.<5:GSPESTPKMOASHPTATK LAEL 

ccaccaacccctctcagsgccacagcgaagagc'- tg^caaaccaccc tc actagccaatc t tgac aagg tcaac tccaacag re tgca tc t acca TC A" 

GG "03 rTGGGGAGAGrcCCGGTG TCGC T TC TCGaaaC A3 TT TGG TGGG AG TGATCGG T TAG-AC TG 7 TCCA3T TGAGG T 7G TC A3ACC 7AGA7GGT AG TA 

pHH14-3
pCB210-14
full available ORF HU-Unc53/1 = pLM1 OR

'* * P => 5 L A N j < V N 3 N S L 0 U P 3 




ccAorGArACCACCCAT3CT7CAAA3GrcccAuA7crGCATGCTACAACcrcACCArc732GuGccc rc rrccrTccTarrrcAcrcccAG tccggcac:: 

GG "C AC T A rGGTGGGT ACGAAG T TTCC ACGGTC TAGACG TACGATOTTCGAGTCGTaGACCCCCGGGAGaGGGAAGGaCGAAG TGjGGG TC AGGCCGTGG " ' * 



pHH14-3
pCB210-14
full available ORF HU-Unc53/1 = pLM1 OR



SSDTrXASXVPOLHATSSASGGPLPSCFTPSPAF 

C ATCC TC AATATTA4C TCAGCC AGC T TC TCCC A3GGCC TGGAGCTAATGAGTGGTT TC AG TG fGCC AAAAGAGACCCGCA TG TACCCCAAAC TCTC AGGL* 

11 1 i ' ) ■ I i i i ; i , , , , T }2C' 

GTAGGAGTTATAATTGAGTCGGTCGAAGAGGGTCCCGGACCTCGATTACTCACCAAAGTCACACGGTTTrCTCTGGGCGTACATGGGGTTTGAGAGrcCG " 



pHH14-3
pCB210-14
full available ORF HU-Unc53/1 = pLM1 OR



llN I NSASF5QGLEL MSGFSVPKE T R M V P K L S G 
CrGCACAGGAGCATGGAGTCCCTCCAGATGCCAATGAGCCrCCCCAGrGCCTTCCCCAGCAG TACTCCCGTCCCCACCCCACCTGCTCCCCCTGCTGCTC 

1 1 1 1 I ■ 1 I ' I ' i ■ ■ i < i i | V"! 

GACGTGTCCrCGTACCTCAGGGAGGTCTACGGTTACrCGGAGGGGTCACGGAAGGGGTCGTCATGAGGGCAGGGGTGGGGTGGACuAGGGGCACGACGAu 



pHH14-3
pCB210-14
full available ORF HU-Unc53/1 = pLM1 OR



LHRSMESLGrf^KSLPSAFPSSTPVPrPPAPPAA 

CC ACAGAAGAAGAG AC3GAAGA3CTG AC TTGG -3 7GG AAGCCCC AG AGCT3GGCAAC TGGAC AGTAATC AGCGGGATCGG AAC AC TC T7CCC AAGAAAG2 
GG TGTC T TC T TC TC T3CCTTC TCGAC TG AACC T: ACC TTCGGGGTC TCGACCCGTTGaCC TGTC ATTAG TCGCCCTAGCC TTG TGAGAAGGG TTCT TTCZ 



pHH14-3
pHH3b
pCB210-14
full available ORF HU-Unc53/1 = pLM1 OR



J 1, t 



rev snirw :-vJ5?.-\3 I ; pr-jref HUMrvZ 



-ceptid2 5/2S23H 



r £ £ £ r £ t * i z s p a a g q l o s n c r o a % i r l => < k 

GC 7C A3G fACCAGC TTCaG TCCCaGGAGGaGaCC AaGq AGAGGCGACATTCCCATACC AT TGG TGGGCTGCCTGAA 7CCGA TGACCAG 7CA5AGCTGCC7 
CGAGrCCArGGTCGAAGTCAGGGTCCTCCTCr-3GrrcCTCTCCGCTGrAAGGGTATGGTAACCACCCGACGGACTTAGGCTACrGGTCAGTCTCGACGGA 



pHH14-3
pHH3b



-r«y primer HUSJr/l 



full available ORF HU-Unc53/1 = pLM1 OR



LaYQlQSQEETKERRHSHTIGGLPESDOQSELP 



•CTCCCC CTGCACTTCCCATGTCTCTGAGTGC^AAGGGCCAACTTACCAACATAGTGAGTCCCACTGCGGCCACCACGCCAAGAATCACCCGCTCCAACA 

' • 1 — ' 1 ' — - — H 1 — ' — 1 ' 1 i ' ' i I » i ■ i 

AGAGGGGGACGTGAAGGGTACAGAGACTCACGTTTCCCGGTTGAATGGTTGTATCACTCAGGGTGACGCCGGTGGTGCGGTTCTTAGTGGGCGAGGTTGT 



pHH14-3
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



SPP ALPMS L SAKGQL T N I V S P r A A TTPR [ TRSM 

GC AfCCCCACCCACGAGGCGGCC TTCGAGCTG'aC AGC 3GCTCCCAAATGGGGAGCACCC tg TCCC TGGCCGAGAGACCC AAGGGAATGaTTCGGTCAGS 

1 ■ i 1 ' 1 ' ■ i ■ * i ► 27cr 

CGrAGGGGTGGGTGCTCCGCCGGAAGCTCGAC^-GTCGCCGAGGGTTTACCCCTCGTGGGACAGGGACCGGCTCTCTGGGTTCCCTTACTAAGCCAGTC:: 



pHH14-3
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



^' p TH£A Ar£ L'SG30M G STLSLAERPKGMIRSG 

-•:cTrc:GA GACCCCACGGACGArgrrcACG r^cTGrcccrGGCCTccAGTGCcrccrccACCTAc rcc tcagc tgaggagaggatgcaatct 
' — > ■ — • 1 ' , , j , , | v; 

^5GAiG3CTCrGGGGTGCCTGCTACAAGTGCC:iGT:A:GACAGGGACCGGAGGTCACGGAGGAGGTGGATGAGGAGTCGACTCCTCrcCTACGTrAGA 

■i 



pHH14-3
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



BDPTOCvHjS f'LSLASSASST YS34EE4M0 



'j>ioC AAA TCCGG AAGC 7?C 37AGG3AAC TGGa- * ~ a " d Z AG GAA a a AG TGG CCACC f TG ACG TCTC AGC 1* 7TC TGCCAA TGC 7AATC TGG 7GGCTGC T " 
C ': , :'yT^GCCrrCGAAG:ATCCCr73ACC"i:-A::37CCTTrrr:ACCGG7GGAACTGCAGAGTCGAAAGACGGTTACGATrAGACCACCGA:GAA 



pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



I 3 < *. = , E v v A T L T S 0 L S - N a m L V A A 

rTGAGCA3AGCCTGGrGAArATGACATCCCGCCr0C3ACACCrGGCAGAGACGGCCGAGGAGAAGCACACrGAGC TGC fGGA TT rGCGAGAAACCAf^A 

A AC fCGTCrCGGACCACTTATAC TGTAGGGCGGACGC TCrGGACCGTC TC TGCCGGC TCC TC TTCC TGTCACTCGACGACC rAAACGCTCrTTGGTATt"? ^ 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



FEOSLVNMTSRLaMLAETAeEKDTEL 



L 0 L R £ r I 0 



CTTTC rG^AGAAAAAGAACTCTGAGGCCCAGGCAGTCATTCAGGGAGCCCTTAATGCCTCAGAAACCAC ACCCAAAGAACTrCGGATCAAGAGACAAAAC 
GAAAGACT7CTTTrTCrTGAGACTCCGGGTCCG7CAGTAAGTCCCTCGGGAATTACGGAGTCTTTGGTGTGGGrTTCTTGAAGCCTAGrrCTCTGTT7T3 C ' 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



F I * N S E A Q A V | OGALNASETTPKELR I X R 0 fl 

rCCTCAGA-AGCATCTCAAGCCTCAACAoCATCACTAGCCATTCCAGCATCGGCAGCAGCAAGGArGCTGATGCG AAAAAGAAGAAAAAAAAGAGTTGG, 
AGO AG TC TATCG TAGAG TTCGGAG T TGTCG TAG TG A TCCGT A AGGTCG TAGCCGTCGTCGTTCCTACGACTACGCTTTTTCTTCTTTTTTfTCTCAACC 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



* S 0 5 ISSLNS ITS HSS IGSS K DA0AKKXKKK3V 
■C7AT3A3CTTCGAAGTTCCTTCAACAAAG CG77C AG 7A TAAAAAAGGGGCCCAAG7CAGC T TCCTCATAC TCGGATATAGAGGAGA T7GC TAC ACCCGA 

1 1 1 i ' i > i i i I ■ . t T- " ,* 

^Ga7ACTCGAAGCTTCAAGGAAGTTGTTTCGC.iaG7:aTATTTTTTCCCCGGGTTCaGTCGAAGGAGTATGAGCCTATATCTCCTCTaaC3ATGTG3GC- 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



v y £ I 3 S S N < A - * 1 KXGPK$ASSYSOC£ElATPC 

crcrrcAGcccccrcAT:ccccAAAC7Ac^Gc^o:*':TACAG^GAC7GCTTCAcccrcc.iTCAAGrccrcc ACC t to 7cc t: : g r ■;<:•■;: ac t: a ro r : 



pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



? 3$p<.o-:^r£r 1 i3Psi>33rL3S v:T0 




ACCL .GCCCTCCTC ACCCACCCCCCCACAC TaGGC TG TTCCA TGCAAA TGAGGAGG AGGAGCCAGAQAAGA AGCAGG7 a TC5G AGC "GCuC TCTGAG^ 



rGGCrcCCGGGACGAGTGGGTCGGGGGGTGTGA7CC3ACAAGGTACGTTTACTCCTCCTCCTCGGTC TC T TCT TCC TCC A 7 AGCC TCGACGCGAGAC TC2 " * " 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR



TEGPAHPAPHTRUFHANEEEEPcKKEVSELRSE 

TA TGGGAGAAGG AAATG AAGC TT AC AG AC A TCCGC TTGGAGGCCC TCAACTCTGCCCACC AACTGG A TCAGC TTCGGGAG ACC A TGC AC AACATGC ACT r 
ATACCCTCTTCC TTTACTTCGAATGTC TGTAGGCGAACC TCCGGGAGTTGAGACGGG TGG TTGACCTAG TCGAAGCCC TC TGGTACG TG 7TGTACG TCAA 



U2 ORF = pCB251 ORF
peptide B72627H
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF



LV EKEMK L TD t3 LEAL NSAHQLDQLftE TMHNMQL 

GGAGG rGGACC TGC TGAAAGC AG AG AArGACCGACTGAAGGTAGCCCCAGGCCCC TCATCAGGCTCCAC TCCAGGGCAGG TCC C TGG A 7C A TC TGC ATT A 
CCTCCACCTGGACGACTTTCGTC TCTTACTGGC'GaC 7TCC ATCGGGG TCCGGGGAGTAG TCCGAGGTGAGGTCCCG TCC AGGG ACC TAG fAGACGTAi* 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF



E v 0 L L K A £ ,N 0 - L K V a P G P S S G 3 T P G 0 V P G i S A L 

rcrTCCCCACGCCGCrCCCTAGGCCTCGCAC'"^: :: ATTCCTTCGGCCCCAGTCTTGC AGACACAGACCrGTCACCCArGGATGCC^'CAg'ACTTGT:- 
AGAA3GGG TGCGGCGAGSG ATCCGG ACC 5 TGA 3 "G3 "aaGGaaGCCGGGG TC AGAACG TC TG TG 7 C TGGACAGTGGGTaCC TACCGTaGTCaTGAAC*:: 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF



iSPPPSLGLA '. "-SF5PSUACT:-. SPH3 




G 'ZCAAAGGAGCAAG TGACCC TCCGGG TGG T 3CG A33A rGCCCCCGC AGCACATCATCAAAGGGG AC T TGAAGCAGCAGC AA T TC TTCC rGGGCT37A3 

CA GGrrtcc tccttcac tcggagccccaccaccac tcc tacgggggcgtcg tg tagtagtt recce tgaacttcgtcgtcc r ta agaaggacccgacatc 



pHH3b 



full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF

G ? K E E VTLRVVVRMPPOH t IKGOLKQO£FFLGC 



C AAGGTC AGTGGAAAAG TTGACTGG A AG A TGC T3GATGAAGC TGTTTTCC AAGTGTTCAAGGaCTaTa7TTCTAAAATGGACCCAGCC7CTACCCT5GGa 
0 7 TCC AG rCACC TTTTCAACTGACC 7TCTACGACC TAC T TCGACAAAAGGTTCACAaGTTCC TGA7A TaAAGATTTTACC TGGGTCGGAGATGGGACCCT 



U2 ORF = pCB251 ORF
pHH3b
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF

KVSGKVDVXMLOEAVFOVFKDri SKMOPASTLG 

C TAAGCACTQAG TCCATCC ATGGC fAC AGC A "CaGCC AC 3TGAAACGAGTGTTGGATGCAGAGCC CCC3GAGATGCCTCCTTGCCGTCG A55TG TCAATA 
GArTCGTGACTCAGGrAGGTACCGATGTCGT < i3-CG37GCACrTTGCTCACAACCTACGTCTCGGGGGuCrCTACGGA GGAACGGCAGC7:CACAGT7A: 

U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15

SrESlHGY5!S.-4VKAVL0A£O>cNPPCR:GVN 



^■;a "A rc AG TC 'CCC TC AAAGC "C r GAA3CAjAA-";^3 TCG ACAGCC TG j fG r "CGAGACGC TGA "CCCC AACCCGA TGA TGC a;c AC "-CaT aa gcc * 

:o-ArAcrcAGAGGGAGrTrccAGAcrr::-:*--A:::AGc re rcG3Ac:ACAAGC T cr3CCAc-A::.33 r r:cGcrA: -a:: a-:--a-t:gga 




C'Z' .Z rGiAGCACCGGCGCCTCG TCC rc TCGGGCCCC A3CGGCACGGGCAAGACC TACC r 3ACCAA r CGC r "GGCCGaG ~ACC TGGTGGAGCGCTCTGG- 
GGACG AC T rCGTGGCCGCGGAGC AGG AG AGCCCGGGG TCGCCGTGCCCGTTCTGGATGGAC FGG T T AGC GAACCGGC TO a 7G3ACCACC 73GCGAGACC3 * 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15



LLKHRRLVLSGP3GTGtCTYLTNRLAEYLV£RSG 

CGTGAGGTCACAGAGGGCArCGrCAGCACCTTCAACATGCACCAGCAGTCTTGCAAGGArCTGCAACTGTATC TTTCCAACCTAGCCAACC AGATAGACC 
GCACrCCAGTGTCrCCCGTAGCAGTCGTGGAAGrTGTACGTGGTCGTCAGAACGTTCCTAGACGTTGACArAGAAAGGTTGGATCGGTTGGrcrATCTGG 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15



-H v:£3iV3T-NMHO0SCK0LQLYLSNUAfJO!O 

GuSAAAC-GG-AT T 33G3A TG T3C3CC r 3G 7G A"7CTAT rGG aTGaCC TGaGTGAAGCaGGC TCC aTCAGTGAGTTGGTCAATGGGGCCC TCACC TGCaA 
-■-Z "TTG rCC r 'AACCC I T ACACG33G ACC - - "AiGATAACC fAC TGGaC TCACfTCG 7CCGAGG1"AG TC AC TC A ACC A3 7 rACCCCGGGAG TGGACG T7 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15






■:':•! : : v a . , . c d l s e a g 5 3 £ l v *i g a . r c » 

g -a rcA . i^A tg rccc mtat ta tagg r accac: aa r: a gcc rg t aaa aa tgacaccc aacc a tggc r tqc ac r rgAgc ~ rc ag gatg r 73 acc -t;*cl 

C w : AG r* TT TaCAGGuATATAATa rCCATGGTGG 7 7AGTCGGACATTT7T AC rGTGGG TTGG rACCG AACG TGAAC TCGAAG TCC TACAAC TGGAASAGU 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15



: ?op;ido B72S20H 

YHKCPYI (GTTNQPVKMTPNHGLHL3FRML T F 3 



AACAACGTGGAGCCAGCCAArGGCrTCCTGGrTCGTTACCTGAGGAGGAAGCTGGTAGAGTCAGACAGCGACATCAATGCCAACAAGGAAGAGC TGCTT: 
r-GTTGCACCTCGGTCGGTTACCGAAGGACCAAGCAATGGACTCCTCCTTCGACCATCTCAGTCTGrCGCTGTAGTTACGGTTGTTCCTTCTCGACGAAG 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15

\ H V c P A M G F L V 3 fLRSKLVESDSO (NANKEELL 



■j- j J "3CTCGiC~G3o F AZCCAAGC fG TGGT a7.~a "C "C C AC ACC T TCC 7T3AGAAGC ACAGC ACC TCAGAC TTCC TC <*TCGGCCC TTGCTTC TTTC TC T 1 
CCC AC3A jC TG-CCC A f 3G G TTZGAC AC CA T AG'aGAGG TGTGGAAGGAAC TCTTCG TGTCG TGGAGTC TGAAGGAG ~ AGCCGGGAACG AAGAAAGAC&2 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15

/• L C- - 7 ' < L •/--r*T-.£KH373CFL I 3 P C F L 






•: >w aggg r - iCCG r a-c tcctgaaggcc tggacc aag t aac tggac acc T tg t tgag a tag r aaggg a tag atg r cc ~ rcc rc gg rice "accc ta t~ ti 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15

CPtGIEOFRTVF I 0 L V N N S I IPYLGEGAk3GII 



GrCCATGGACAGAAAGCTGCTTGGGAGGACCCAGTGGAATGGGTCCGGGACACACTTCCCTGGCCATCAGCCCAACAAGACCAATCAAAGCrGTACCAC: 

» ■ i 1 » i ' > ■ ' ■ ■ i i 

CAGGTACCTGTCTTTCGACGAACCCTCCTGGGTCACCTTACCCAGGCCCTGTGTGAAGGGACCGGTAGTCGGGrrGTTCTGGrTAGTTTCGACATGG-G: 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15

C :< a a v £ 0 ? '•/ £ v v 3 0 T L * v S a q 0 0 0 S :< L V H 



•j:c:cc~cccacc3t:ggccctca:agca::::c *c acctcccgagg at aggac ag tcaaagacagc acc aag ttc tc fggac tcagatccc tga* 

- ^ U G 3 SG TGGG 7G3 C a C CC GGG AG TG r C G T A AC GS a 3 7 3 3AGGGC "CC TATCC TG TCAG TTTC TG'CG'GGjG TTC AAG AG ACC TuAG'CT AGSAGACT- 



U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15

P P r v G ? - S i a £ ? P £ 0 R T V K 0 3 : P 3 S L D 5 : - L : 

OGCCArGC'ccrGAAACTTCAAGAAGCTsc c^ACTACAr rcAsrcrccAGArccAGAAACCArccaoAcccc aacc r rc ag3c a ac ac t r taagggt rr 
C'jjGrACGACGACTTTGAAGTTcrrcGACGG'TGATGrAACTCAGAGGTcrACcrcrrrGGrAGGAccrGGGGrrGGAAGrccGrrGrGAAArrccCAii ■ >4 "'' 

> * 

U2 ORF = pCB251 ORF
pHH3b
U4 ORF = pCB201 ORF
full available ORF HU-Unc53/1 = pLM1 OR
U3 ORF = pLM5 ORF
pHH15



-p-jpticii 372025H 



AML LKL QE AANY 1ESP D RCTILOPNLQATL.GF 

GGCAATCACTG7CACCCCCGGACAGC AGAACGC'GGC ATCAGCTATCTTAGCTCCTCC TC TCCCC7C TCC 7CT fTCAGAGCAC TGGC TC TCC AGCCCC A3 
CCoTrAGTGACAGTGGGGGCCTGTCGTCTTGCGACCGTAGTCGATAGAATCGAGGAGGAGAGGGGAGAGGAGAAACTCTCGrGACCGAGAGGTCGGGGTC 



pHH3b
pHH15



•3 .\ H C H ? R T AEPW HOLS L L L S ? L L F Q S T G 5 P A P 

G ~UGA3AAC AGCA3SGAGG AGGAGATGAAAGA33AGG 3ACAG37 TC 7T3 G TGCTGTACC77T3AGAAC7rcCT AGGAAGGAA TGG rGGGGTSGCGT TTG3 
C"CCrCTTGTCCr;cCTCCrcCTCrACrT7Cr.::*:::TGrcCAAGAACCACGACATG:AAACTCTTGAAoGATCCrTCCTTACCACCCCACCGCAAA«:C 



pHH3b
pHH15



'j 3 E 0 i 3 G G 0 E * 3 3 T G S V C C 7 F £ H F L G R N G G V A F G 

0^crrGrGCCccc7AAACACA7TrAcr3GCC7::"c y AATGAcr:r3GG3AAAAGA-G A:r:TGGGT:rrrc:cr:GA:rrcrr37T-CAA:rACAA^; 

C ' ■CAACACGGGG3ATT7GrGrAAArGA:CGGA::i3ArrACrGAAACCCCTTT7CrA.:TAi3ACCCA3AAAGGGAACr3AA3AACAAAGrrAA-GrrT^ 



pHH3b
pHH15

K L >: 0 L N r F r G L ^ . L V G K 0 0 5 3 S F ? . L L V 5 t r fl 

"33G:r7TC:3333AGGGG77CACAAAA:^":.:^^ ^; t :c TGCAGC AG T TCC 'AAA T3 a r TC 7 Z A3AA3CAACCC "GAG A3 AG AC AG ~Z~~Z "GAGG3 

'i:;ACcc3AAAGAc::cTccccAAG7CT77T:'i:---r3rGACG7C3rcAAGGAT:r-:7AA3A.:*G***:3TrGGGAC7crc7c 7gtcaj aa3ac rcc: 

' 1. 



pHH3b
pHH15

A .- v 3 G V 0 * : ■ ■ m r ;J s 




AQa TC TGGGGGACGCAGGAAGCTCC TCAGATTT7C TCACAGACCCTTCCCAATTCCA fCACC AC TCCCAACAACT CC "CCCCC AGAGA7C7GGC TGGa»j~ 
TC T AG AC CCCCTCCGTCCTTCGAGG AG TCTAAAAG AG TG TCTGGGAAGGGTTAAGGTmGTGG TGACGGTTG TTGAGGAGGGGGTC TC TmGA^C"* AC~* 

pHH15

£ IVGRQEAPQ trSQTLPNSI TT 



annssprolaga 



CCAGAAAAAGAAGCATGTGGTTTAAAAAATGTTTAAATCAATCTGTAAAAGGTAAAAATGAAAAAACAAAAACAAGCAAA CAAACAAAAAACAATGGAAA 

GQTCTTTrrcTrcT^ w 

0 K K K H V V , KMFKS1 C K R . K K N < N K Q T N K K Q V » ' 

AGATGAAGCTGGAGAGAGAGGAACCAGTTGCCAAGGTAGAGAGCTGCCCGCTCCTGCCCrcrGGArGACArAGGGGAC ATCAACAAGACGGCTGCCAAC-J 
TC7ACTTCGACCTCTC TCTCCTTGGTCAACGG77CCATC TCTCGACGGGCGAGGACGGGAGACC TACTG TATCCCCTGTAGTTG T TCTGCCGACGGTTGG 



S VRERNQL Pft RAARSCPLDO I G D INKT 



A A N 



TG AGAAGTCACCAAACCACAAAAATAACCTTACAGCCTTCAGGGAAAG ACT ACCAGCTCTGTCT TTCTACCCTCTA AT TTAACAA TGC ACCGGAATTCA3 
ACTCTTCAGTGGTTTGGTGTTTTTATTGGAATGTCGGAAGTCCCTTTCTGATGGTCGAGACAGAAAGATGGGAGATTAAATTGTTACGTGGCCTTAAG 

linker? - 

L R , S H Q T T , * | T L 0 P S G K 0 Y Q L C L S T L .FNNAPEF3 

C TTGGAC TTAACC 

— 6013 

GAACCTGAATTGG 

— 2 

linker?

L 0 L T 



GflflrTccccccwwccrcPcnccsaTccfiPcccccBCfiGCTCGCCAflcrccflcacTflflrcflCCGCCflrcccflflCfiCTcrrccWAcaflfia 100

MSGCCLrvSGSPAAGQt.0SNORORNrLP«<CLA 
T»CCfl6CTTCBGTCCC«MGGPGflCCA«GG»GflGCCGflCflnCCCBTRCCflTrGGTGGGCTGCCTGflflTCCG«rGflCCflGrCBGaCCrCCCTTCTCCCC 200 

vouosocctk e o o h ' s m r tGGLPesooostuPSP 

CTGCftC TTCCCRTGTC TC TGPG'GCRPPGGGCCRRCTTBCCflBCRTRGTGRGTCCCRC TGCGCCCRCCBCGCCRRGRPrCBCCCGC TCCPRCRGCRTCCC 300 

p»LPnsis«*GOLrMivsprpRTTPBirasMsiP 

CPCCCPCCPGGCGGCC T7ZZKC TG TflCSGCGGC TCCCBRR TGGGGRGCBCCC TG TCCC TGGCCGPCRGRCCCRRGCGRRTGPTTCGG TCRGGRTCC TTC «00 
T M C R R f t L YSGSOHGSTL SLPCRPKC/IIPSGSr 

CCPGflCCCCPCGGRCGRTSrTC?CCGCTCRGTGCTGTCCCTGGCCTCCRGTGCCTCCTCCRCCTRCTCCTCPGCTGflGGRGRGGarGCflflrcTGRGCRRR SOO 
POP TOOvh GSvtSLPSSPSS rYSSRCCPnOStO 

" ~~ ock A 



rCCGGRRGC TTCG TRGGGPBCrCGPRTCBTCCCflGGRRRRBGTGGCCftCCT TGPCGTC TCRGCTT TC TGCCRBTGC TRR TC TGGT3CC "GCTTTTGRGCR 600 
\ B t I BBClCSSQCK VRTl TSOLSR MRNLvPflrCO 

GRGCCTGGrGPRrflrGRCay::;3CC:CCuRCPCCTGGCRGflGRCGGCCGRG6BGPJ»GGaCRCTGRGCTGCTGGRTTTGCGRGflPRCCPTBGRCTTTCTG 700 
homology block A

PflGRBBRflCPRCTCrGPGGC::?:GCBGTCBTrCflGGGRCCCCTTRRrGCCTCRGflRBCC«CBCCCflBRGRflCTTCGGRTCPPCSGaCPBPPCTCCTCBG 8C0 
<KKMSCa3PvlOGRLMRSCTT P « C LRIKROMSS 

homology block A

RTPCCRTCTCRBGCCTCPPCSGCSTCRCrPGwCBrrCCBCCflrCGGCflCCBCCBPGGBrGCTCRTGCGflPRRBGPRGBRRPRPRBGRGTrGGGTCTflrGR 900 
OSISSLNSfTSMSS IGSS<QBOR*« K g K K S V V V C 

| mncocy oke* 9 

CCrTCGRPCTTCCrTC=flCS"C:3r7CSGrp:OBflflflRCGGGCCCPRGrCPGCrTCCrCRrPCrCGGRTHTflGflGGPGPrTGCTacaCCCCRCTCTTCR 1000 

LRssr<(<3rsiKKGP«sRS$vsoiceiprPoss 

GCCCCCTCRTCCCCCPPPC'-CaJCaTGGrTCracRGBGPCTGCTTCflCCCTCCRrCPPGrccrCCRCCrrGTCCTCCGtGCCCPXTGaTGrCRCCGBGG HOO 

RPSSPKi:nssrc r r s p s i < sstlssvg^ovtc 

GCCC TGC TCPCCCRGCCC:: :sc *-C "PCCC TG TTCCR TGCRBPrGRGGRGGBGCPGCC BGPGRRGPRGGflGG Tfl TCGGPGC TGCGC 'Z TGPGC TBTGCGB »?00 

cprhprp-tol^mrmceecp cgggvsciasci.vc 



GRPGGaflRfGRRGCrrPCPGPCS*CCGC:TGCRGGCCCTCPRCTCTGCCCBCCPRCTGGRrCRGCTTCGGGRGflCCPrCCPC«OCorGCPGrTCGRGGrG >300 
cenKLro»aLCRLHSRHOtO0LRETnH««not.cv 

GRCC TGC TGRRRGCPCPGsa *GS ::SPC 'CPRGG rRGCCCCBGGCCCC TCP fC PGGC TCC PC TCCSGGGCRGGTCCC TCCR TCPTC "3CR HR TC TTCCC ««00 
3LLtRC-:a. CVPPGPSSGS rPGOVPGSS«lSS 
"or-cogv p>geK C ^ 



c^cgccgc ^cc c TacGc: *: =■ : *;s : : : 5 • *c : rrcGCCc:: *c rc ttgcsgpcpc «g «cc re tcrcccp tggr iggcr r c *g • s c tig rGGTccRRR « $00 
oBPSLGvi. , '-*srGPSLflOT3i.spnaGis:cGPt 

GGPGGPRGrcRCCC-c::;: , :;';:-;-;:P-:c:cccGCAG:-c«»TCRTceacGCGGc:rTcssGCRGcnGGRATrcrTc:rci;: *c jpgcrrggtc <soo 

C S v T l 9 , . 5-pOO .ll I * G 0 L « OOCffv^CSKV 



jg *GGPRopcr rcRc::: ; - ;-•*.: *c i** * j^-ic 'G tt * tcc»^g *g ticrrggsc rs *a r ttc rs^arcGPeccsc:: *: *-:::*-cc»crRRGCP i'^o 

S G * v Q » « • ; f^yfCVftQv ts<"Q p BS*'.G*.S 

homology block C

: :G5crccPTCfl'G;c - -:i::i*:5:;:3CG^npcGBG'GMGGflrGCPuPGcc::::GAG5rcccrccr 'Gc:s:c;=;c"-:M f RPCRTR?c 'Jco 

* £ S I h "1 » s s^'snvLORCPPCnPOCO'S'^MlS 
fiCTCTCCCTCflftflCGrCTCaacGSCaaflTccCTCCACACCCTCCTGTTCCPC^CCCTCfl-CCCCflOCCCCArCflTCCPCCfiCTRCflWRCCCTCCTCCTC 1400
vst* G l. c C * C vOStvfCT*- iPtPnuOHViSttl 
I pproogy goct 6 • quo rnci«oi<q« BO 



«»CCfiCCGCCCCCTCCTCCTC:c;32CCCC«(iCGGCRCCGCC»AC»CCTfiCC'GflCCftPTCCCTTCCCCC«CTRCCTCCTCCflCCGCTCTCCCCGTCflCG 2000 
KMPPIVL5SPS6TGKTVI f N fl l A g V I V C B S 6 B C 

npmoiogy pocb E • Of C nuc»«cra« 60 

rCACAGflGGGCaTCG*;ac:^:--:SflC«TGCflCCflGC«GTCTTGCnflGCa'CTGCa»:?GTPTCrrTCCflflCCTRGCCoeCC«;procflCCCGCR«« 2«00 
VTCG»vS7 f, «WHQ0SC<0CQl. v l>Sl«L0MQ10QCT 
fyynaoqy ptcct E ceo nyc;*a c< SO 

AGGAATTG£GCArG?GCCCC TCiCTTJBT TtTBTTGGArGACCTGAGTGAB&CBGGCTCCATCRGTGflG rTGGTCARTGGCGCCC TTBCC TGCRAGTATCAT 2200 
G I G 0 V P I V IlLDQLSCflGSlSCLVMGflL TCCVM 

nome*oqy aoc« E ♦ pr to nucxoto* 60 

AAATCTCCCTAT?»nATflGGTAXCACCAA7CPGCCTGTAflAAArGACACCCMCCflTGGCTTGCACTTGfiGCrrCAGGArGnGACCTTCTCCAACARCG 2300 
<CP VI tGTTMQPVtnTPKHGLHLSfPnLTfSMIt 

romoioqy joe* g • »»o nuoaonx BO 

TGGAGCCAGCCAATGGCTTCCTGG'TCGTTACCrGAGGAGGAAGCTGG TRGBGTtflGBCflGC GflCBTCAATGCCAACAAGGAAGAGCTGC T7CGGGTGCT 2a00 
VCPA M 6 f L VAVCWAKLVCSOSO I MANKCCllAVt 

nomoioqy aoeic E • pita nua«ond» BO 

CGACTGGGTOCCCPACCTGTGGTATCATCTCCACACCnCCnGAGflAGCACaGCPCCTCAGACTTCCTCATCGGCCCTTGCTTCTTTCTGTCGTGTCCC 2500 
nomoogy aoei E ■ tuta nucHotid« BO 

ATTGGCArTGACGACTrCCGG?CCT3GrrCATTSACCTGTGGWCaP£rCTATCnrTCCCTflTCTACAGGAAGGAGCCAAGGATCGGATflARGGTCCATG 2600 

[CI c o r o r » r 101 vmwsi iPviOCGflcOGitVH 
nomqogy aoo S • cyeo nucleate* 90 

GACflGAABCC-GCr7GGSSCGSCC:5C:iGSarcCGTCCGG5SCACACTTCCCrGGCCATCflGCCCAACAAGACCAATCAflAGCTG7ACCACCTGCCCCC 2700 
GQgQavCOovCvvflOTiPvPSflOQOOSiCLVHLPP 
nomoogy bock 6 • prtfl nucieatl* 60 

ACCCRCCGTGGGCCC'CACAGCa rr 3C-7C rtCC*CCCGAGGATPGGflCRGTCflPfiGACflGCaCCCCflAGTTCTCTGGACTCBGATCCTCTGflTGGCCATG 2800 

PTVGPHS ; ^3PPC0ATVK0STPSSLOSOPLnfln 
nomotogy aacn £ • pied nuci<atQ« BO 

CTGC TGRAACT TTPRGaAGC TCCCSSC rPCAfTGRGTCTCCfiGA rCGRGBBBC CA TCC TGGACCCCAACCTTCRG^CBBCPCT nABGGG^CGCCAATC 2900 
LLCLOCAasv IESP0BCTIL0PMLOAT L F 

■•*or*oogy aoc> £ j ytd nuexoic* 30 ■ 
ACTCTCaCCCCCGGACaGCPGgoCGITGGCaTCPGCTATCTTPCCTCCTCC T CTCCCC TC TCClt rTTCPGAGC AC TCGCfCTCCBGCCCCBCGBCGAGA 3000 

BCPGGACGC0GGPCGaGarGsac:g;CJCGGacaGGrTCrTGG7GCTGraCCTTTCPGaACTTCC rPGGAACGAWTCGTCCGG TG6CG TT TGGGAAC TTG 3»00 

rGCceccrBPPcacPF~ac *aarGBC TTTGCCGaAPPGPTGa ttctggg'C tttccc r tgact tc ftg ntc bat facrapc tcctggg 32 oo 

3' ^g*n»uteo truer 

CT rTCTCGGCa:c:o: T CaGapaJCa*:3ac;6CQcrGCBGCaG7TCCTBflP TGB7 rcrcacaaGCPPCCCTGaCAGPGacaCTCrrGrCAGGGaGATCTG 3300 



GGGGa&GCBGGSaGC 7CC ?CP:a : :Z»t^ZZZTKC^Ht rcCflrcaCCPC rGCCPBC^PC TCC TCCCCCBGBGa TC 7GGCTGGBGCCCBGBAA 3«00 

opsaoGCPTGT^c :?Tsasss=*:' *• •:;3rcrGtaaapc;:aaappTsaafipflQCSPacpcaaGcapgcaBReaapQfl«CpafCGPgpaGaTGAA 3500 
GC rccacPGaG«G:^a:;sG r-:;;^:;-;;z:sGc tgcc:;: •Ccrecc: > :"Ga:oaca:*CG.;GaeaTCPac3a':se^GC rcccaacc rcaGBBG 3600 
TCBCCPPaccacsaaaJTapCfJ;-::: "zscccaaacac "CCaGCTC'G'C rr rc »aec: T C raof t rPBCPa-;caCCGGca rie*GC HGGPC 3?00 

3 J*V*rVi'*<3 '*•**> 



FT PPCC 3? 36 
Tuesday, 18 November 1997 13:57
fig 54 pLM1 (1 > 8285) Site and Sequence
Enzymes: 73 of 143 enzymes (Filtered)

Settings: Circular, Certain Sites Only, Standard Genetic Code

Gr3GCACT777CGGGGAAA7G7GCGCGGAACCCCTA777GT77A7rTTTC Taj A T AC A 7 TC AAA 7a TG r A TCCGC 7CA 73 AG AC AA 7 AACCC TGAfAAAT 






— — — — — ~ ■ ■ ■ ' ■■ ■■ ■ ■■■ ■ — loo 

CACCGrGAAAAGCCCCTTTACACGCGCCTTGGGGATAAACAAATAAAAAGArTTA TG7AAG77TATaCA7AGGCGAGTaC TC TG T T A 7 T3GG AC T*A r FTA 

GGTFRGNVRGTPICLFF. I HSNMVPLMRQ . P .11 



GCrrCAATAATArTGAAAAAGGAAGASTATGAGTATTCAACATTTCCGTGrCGCCCTTATTCCCTTrTTTGCGGCATTrrGCCrT CCrGTTrTTGCTCAC 
CGAAGrTATTATAACTTTTTCCTTCrCArACTCArAAGTTGTAAAGGCACASCGGGAArAAGyGAAAAAACGCCGrAAAACGGAAGGACAAAAACGAGTG 
L Q . t . KRKSMS I 0 H F R V A L IPFFAAFCLPVFAH 



CCAGAAACGCTGGrGAAAGTAAAAGATGCrGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACA GCGGrAAGATCCTTGAGAGTT 
GGTCrrTGCGACCACTTTCATTTTCrACGACTTCrAGTCAACCCACGTGCrCACCCAArGTASCTTGACCTAGAGTTGTCGCCATTCTAGGAACTCTCAA ^ 
PETLVJCVKOAEOOLGARVGYIELOLNSGKILES 



TTCGCCCCgAAGAACGrTTTCCAATGATGAGCACTTTrAAAGrTCTGCTATGTGGCGCGGTArTATCCCGTATTGACGCCGGGCAAGAG CAACTCGGTCG 
AAGCGGGGCrTCTTGCAAAAGGTTAC TACTCGTGAAAA7TTCAAGACGATACACCGCGCCATAATAGGGCArAACTGCGGCCCG TTC TCGTrGAGCCAGC 
FRPEERFPMM5TFKVLLCGAVLSR10A6QEQLGR 



CCGCArACACTArrCTCAGAATGACr TGGTTGAGrACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCrGCC 

' ' I I I ■ ■ ll. I I It .... I I ... - 1 1 . ■ . . t t t CQQ 

GGCGrATGTGATAAGAGTCTTACTGAACCAACTCATGAGTGGTCAGTGTCTTTTCGTAGAATGCCTACCGTACTGTCATrCTCTTAArACGTCACGACGG 
B »H v SONOLVEYSPVTEKHLTOGHTVRELCSAA 



ArAACCATGAGTGArAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGA'jCTAACCGCrTTTTTGCACAACATGGGGGATCA TGTAA 
rATTGGTACrCACTATrGTGACGCCGGTTGAATGAAGACTGTTGCTAGCCrcCTGGCTTCCrCGATrGGCGAAAAAACGTGTTGTACCCCCTAGTACATT 
J T MSOMTAANLULTT IGGPKSLTaFLHNMGDMV 



CrcGCCTTGATCGrTGGGAACCGGAGCTGAATGAAGCCArACCAAACGACGAGCGTGACACCACGArGCCrGTAGCAATGGCAACA ACGTTGCGCAAACT 
GAGCGGAACrAGCAACCCrTGGCCTCGACrTACTrCGGTATGGrTrGCTGCrCGCACTGTGGrGCTACGGACATCGTTACCGTTGrTGCAACGCGTTTGA 
r P L DW VE P E L WE A f P M Q E R 0 T r * P V AX A T T L R K L 

ArTAACTGGCGAACTACrTACrCTAGCTTCCCGGCAACAATrAATAGACTGGATGGAGGCGGATAAAGT TGCAGGACCACTTCrGCGCrCGGCCCrrCCG 
riATTGACCGCTTGATGAATGAGArCGAAGGGCCG7T3rTAA:TA-CTGACCTACCTCC3CC7ATTrCAACGTCCrGCrGAAGACGCGAGCCGGGAAGGC 
L r C E LL T L A S * QO L I O VH E AP KV A GPLLRSALP 

:crj.;c7GG777A77GCTGA7AAA7crGGAGCCGGrGAGCG7GGG7C7CGCGGTA7CA77GCAGCAC7GGGGCCAGA7G GTAAGCCCrcCC3rATCGTAG 

:ii;:GACCAAA7AACGACTA7T7AGAccrcGGCCAC7C3CA:::A3AGCGCCArAG7AAC':r:GTGACcccGC7cr-kCCAT7CGGGAGGGCArAGCA7c 5 

A * tfF l *OKSGAGE^G S R G I l*ALGP03»CPS*tV 

^'"SACGGGQAGTCAGGCAAC "ATGGA7GAAC3AAA f A j ACAGA f CGC TGAGaTaIGTGCCTCAC TGAT TAA3CA 7TGGTA AC 7 j r C AGACC A 

4i-jjA:GrKTGCCCCTCAGrccGrr:A7Ac:7AC77GCTr:Ar.:73rcrA5CGAC7crA:::AcaGAcrGACTAj7 7CG7AACCAr73A:AG7CTGGT 

I ! T r g SOArHDSRH*0|AEIG*SLI<MV.LSOO 



tooo 



A3 T f f *C 7CATA TATACT 7TAGAT 7GA777AAA AC 7 7C AT 7 77 TA A ? T T AAAAGGA7C 7AGG 7GAAGA7CC 7TTTTGA7AA7C TCA TG AC CAAAA7CCCT 

aaaTQAj rATATAfGAAATC TAAC TAAA T7T TGAA «7AA AAA 7 7AAA TT T TCC TAGA 7CCACTTCTAGGAAAAAC TaT TaGAGTaC T3»3 7 77 7AGGGA ' '°° 
V y 5 T 1 t ; 0 I * l » f F K R | , V K I L F 0 N L M T * 1 P 

r^^w;r3AUTT7CG7rcCAC7GAGCG?CA5ACCCCG74GAAAAGA'CAAAGGA7C77C77GA3A7CCT7TTTTTC7GCGCGTAATC7GCTGC7TGCAAA 



7J.:ACT.:AAAAGCAAGG7GAC7CGCiG7C733GGCAr:7rrTC-A3777CC7AGAAGiAC7CrAGGAAAAAAAGACGC3CArTAGA:3A:3AACGIT7 

Q £ F s F H .asopv-< ikgss.opfflpvicccq 



1200 



;AAAAAAAC:ACC3CrACCA3CGG7GG777G77 7533 33A';aA3A3:TaC3AaC7C 7777 ";;gaaGG7aaCTGGCT7:a3CAGAGC3Ca:a7ACCaaa 
i* " *7 77-33 73GCGA rGG T CGCCACCAAAC AA ACG3 I^fA^TfC r r 3A T j3 7 7G A3 AAJAa I- T TCCA r IGACCGaaG ICG7C 7CGC 3 7ATGG r T T 
r s ; » P I P * V V C L O 0 Q ; L P T L F P < V 7 G F 5 R A ; [ p .N 

?a-. -gtcc r:;7A37G7AGCCGTA.jTTA.:3ccA:;AC7-;Ai;AA;-;-.^rA3.:A:3G:::4;A TA:c rc jC rc 7gc :aa :gtt ac:a; tggc 73c 7 

- ' W A3GAA3A 7CAC A7C3 3CA ;caa:c: 337^ 

'*' u » 7 • * «- * * ^* 7 < 3 V A 3 3 T r l * L «. I L L » »' A A 

GC^'AG TCCCGATXAGrCGTGrCTTACCCGG T TCG ACTCAAG ACGaTaGTTaCCGGA T AAGCCGCAGCGGTCGGGCTGAACGuGGCG T rCGTSCiCACAGC 

» ' ' ' ' ■ ' ' I5CC 

C<i(irCACCGCTATTCACCACAaAATGCCCCAACCTCAGrrC rCCTArCAArCCCCTATTCCGCCTCGCCAGCCCCACTTCCCCCCCAAGCACOrcraTCC 

A S G 0 K S C L r G L P S 9 R. I P 0 K A 0 g S G . T C G S C T Q 

CCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCSTGAGCTATGAGAAAGCGCCACGCTrcCCGAAGGGAGAAAGGCGGACACGTATCC 
G.-iTCGAACCrCCCTTGCrGGArGTGGCTTCACTCTATGGA:GrCGCACTCGATAC TC T7TCGCGGTGCGA AGGGC7 7CCCTC 7T TCCGCC TG TCC A7 AGG 
p 5 L Eft rTYTCUR rtORE U . ESATIPJGMM R r p 

GGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCG6GTTTCGCCACCTCTGACTT 

' ' ' ' ' ' I I > ■ I ■ I ■ I | » |7QQ 

CCATTCGCCGTCCCAGCCTTCrCCTCTCGCGTGCTCCCrCGAAGGrcCCCCTTTGCGGACCArAGAAATArCAGGACAGCCCAAAGCGGTGGAGACrGAA 
VSGftVGTGE ft T ft E L P GONAVYLYSPvGFQHL . L 

GAUCGTCGATTTTTGTGATGCTCGTCAGGGGCGCGGAGCCTArSGAAAAACCCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTrGCTGGCCTrTTG 
CFCGCAGC rAAAAACACTACGAGCAGTCCCCCCGCCTCGGATACCTrTTTGCGGrCGTTGCGCCGGAAAAArGCCAAGGACCGGAAAACGACCGGAAAAC 
E ft P. F L . C S SG GRS LVKN A S N A A F I R F I A F C V P F 

CrCACATGTTCTTTCCrGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCACCCGAACGACCGAGCG 

i ■ i ■ i ■ * ■ i ■ i 190Q 

GAGTGrACAAGAAAGGACGCAATAGGGGACTAAGACACCTATTGGCATAArGGCGGAAACTCACTCGACTArGGCGAGCGGCGTCGGCTTGCTGGCrCGC 
AHMfFPALSPOS V 0 H ft (TAPE ADTAft&SftTTER 

■ 1 ™ ' * ■■ ■ * 1 ■ ■ ■ * ■ ■ I IWJ M ■ ■ « ■ ■ ■ H ■ ■ I I I I I I ill - li - ■ it I . |. t I I .... > fc 

CAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGA TTCAT rAATGCAGCTGGCACGACAGGTTT 
GrCGCrCAGrCACTCGCTCCTTCGCCTTCTCGCGGGTTATGCGTTTGGCGGAGAGGGGCGCGCAACCGGC TAAGTAAf 7ACG7CGACCGTGC TGTCCAAA 2000 
S E S V S E £ A E E ft P I ft K P P L P A ft V P IH.CSVHORF 

CCCGACTGGAAAGCCGGCAGTGAGCGCAACGCAArTAAr3T3AGT7AGCrCACTCATrAGGCACCCCAGGCTTTACACrrTATGCTTCCGGCTCGTATGT 

— — i ■ ' ' ■ t ■ t 2 ICO 

GiiUCrGACCTTTCGCCCGTCACICGCGTrGCGTTAATTACACrCAArCCAGrGAGrAArCCGrGCGGTCCGAAATGTGAAATACGAAGGCCGAGCATACA 

POVKAGSEftNAINVS LTH . A P Q A L M F M L P A R M 

TGrGrGGAAnCTGAGCGGArAACAATTTCACACAGGAAACAGCTAT-GACCATGArTACGCCAAGCGCGCAArTAACCCrCACrAAAGGGAACAAAAGCT 

— — . 1 ■ ( , , , i i 220O 

ACACACCTrAACACTCGCCTArTGTrAAAGT3rGTCCT7T3T:3A-ACTGGrACrAATGCGGTTCGCGCGTTAAT7GGGAGTGArTTCCCTTGTTTTCGA 
I CG I V SG . Q F H T GN3 Y D M 0 Y A K ft A INPM.REOlCL 

Gi.i7ACCGGGCCCCCCCTCGAGGTCGACG2rATCGATAA3C ? T3A:ATCGAATTCCTGCAGCCCCTGCTCTTCAGCCAGATGCTGGACCCAGA5TCCCAG 

' — ' — ' • ' 1 1 i ■ ■ i i 2300 

C'.CA7GGCCCGGGG3oGAGC TCCAGC FGCC A7AGCTA77CGAAC74* AGCTTAAGGACGTCGGGGACGAGAAGTCG5TC rACGACCTGGGTC TCAGGGTC 



insert pLM1
ORF pLM1



O T G P P L £ V 0 G 10 < ■. 0 :EFL0PtLFS3.'iV0P£S0 

AOAAAGAGGACA37GCACAA7GTCCTGGA7C7CCGGCA3AA::7GlAAGAGACCATG7CCAGCCTGCGAGGGTCCCAGGrGAC rCACAGC7CCCTGGACA, 

1 1 1 i i i ■ i — ■ i » > > i i i i 2«C0 

rCT7rCTCCrGTCA;G7CTTACAGGACC7A3A3GCC3T3TT3GAC:77CTCTGG7ACAGGTCGGACGCTCCCAGGGrCCACTGAGTG7CGAGGGACCTCT 



insert pLM1



a <^ yv 0«VL0 Cft3 ?iL£e T MSSLRGSOVTHSSLE 

rUAC Cr3CrAC5ACAGCGA7GATGCCAAC:CACGCAGC373"::A::C7C7CCAACCGCTCG7CCCCrcrG7CA733CGCTAT;GCCA3TCCAGrcCGCG 

1 1 i — i.i i > i > 2 SCO 

~-r "S-3ACGATGC f37C0C TAC 7ACCGT7G3 373CG7C jt-C «33 T I 33AGAGGTTGGCGAGC AGGGGACACAGf ACC3CGA 7 ACCGG 7CAGG fC AGGCGC 

insert pLM1
ORF pLM1

* r -*O300ANPR5 'SSLSNRSSPl.3VP vCOS3PR 




CC rCCACGCrGCrGACGCCCCCTCTGTCCGTCCCACC:5::::rc:GAGG5CACCCCCGCCr3GTACATGCACCGCGAACCGGCCCACrACrcCCACACC 

— _ .i i »i, — ■ i. >~--— ' ■ * ■ ■ > i i 2600 

CGACGTCCGACCACTGCCCGGGAGACACCCACCCrCGACSGCSAGCCTCCCCTGCGGGCGGACCATGrACGrGCCGCTTGCCCGGGTGATGAGGGrGrGG 

insert pLM1
ORF pLM1

LQAGDAPSVGG S C % 5 E G TPAVfMHGElAHYSM T 



ArcCCCATGCGCAGCCCCAGCAAGCrCAGCCATArcrCCCSCCrGQAGCTGGTCGAATCCC TGGACTCGCArGAGGTGGACCTCAAGfCCGGCTACATGA 
fACGGGTACGCGrCGGGGTCGTTCGAGTCGGTATAGAGGcC 55ACCTCGACCAGC TTAGGCACCTGAGCCTACTCCACCTCCAGTTCAGGCCCATGTACT 

insert pLM1
ORF pLM1

MPMASPSKLSH I S?', E I V E SLOSOEVOLKSGYM 



GCUACAGTGACC rCATGGGCAAGACCATGACGGAGGAr^A'^CATCAC TACCGGCTGGGA rGAAACCAGCTCCATCAG tAGTGGAC TCAGCGA TGCCTC 

» i ' 1 1 ' ' < 2800 

CGCTGTCACTGGAGrACCCGTTCTGGTACTGCCTCCTACTACrGTAGTGATGGCCGACCCTACTTTCGTCGAGGTAGTCATCACCTGAGTCGCTACGGAG 



insert pLM1
ORF pLM1

SO SOU MG KTM TEOCO 1 T T G V 0 £ S S 3 I S 5 G L S D A S 

AGACAATCTCAGrrCAGAAGAATTCAATGCCAGCTCCTCAC 7CAAC rCCCTCCCAAGTACTCCCAC TGCTTC TCGCAGGAAC TCAACAATAGTGCTACGC 
TC TGTTAGAGTCAAGTCTTCTTAAGTTACGGTCGAGGAGT^AGTTGAGGGAGGGTTCArGAGGGTGACGAAGAGCGTCCrTGAGrTGrTATCACGATGC G 

insert pLM1
ORF pLM1

0 H l S S E £ f H A SSS.NSLPSTPTAS«aNSTlVL« 

ACAGAC7CAGAGAAGCGCTCACTGGCAGAAAG7CGGCTGA::-GGrrrAGrGAArCAGAGGAGAAAGCCCCTAAAAAACTGGAGTACGACAGTGGrAGCC 
TGTCrGAGrCTCrTCGCGAGTGACCGTCTTTCACCCGA-Tt^CCAAATCACTTAGTCrCCTCTTTCGGGGATTTrTTGACCTCATGCTGTCACCArCGG 

insert pLM1
ORF pLM1

IOS £KftSlA£ SGL : V r SCSEEKAPKKL.E TDSGS 

rjAAjATGSAACCrGGGACTrcrAAGTGGC3GAGG0^::3:::TgA;aGCr5TGATGArTCA7CCAAGGGrGG AGAAC TG AAAAAGCCC ATC AGCC TGGG 

ac r :c tacc t tggaccc tgaagat tcaccgcctcc: t ;;ac r : tcgacac r ac rAAGTAGGirccCACCTcr tgac tttttcgggtag tcggaccc 



insert pLM1
ORF pLM1

L< W-» Gr S<wB3£S3£S C OOSSKGGELXKP ( SLG 

C.-i'JCCTGGrrcCCTGAAGAAGGGCaAGACCCCACC'S'^ZT'S'AACTTCCCCCATCACrCACA CAGCCCAGA GrGCCC TCAAA G TC GC AG GC AAACC T 

*j jjVjGaccaagggac r tct recce ttc 73ggo7-jGa;a; i - t g^aggggg tagtgag rGrcrcGGGrcTCACGGGAGT r fcagcgtccgt ttgga 



insert pLM1





**'SSLK<GKr»»7 


i '« 7 $ P I rH7AQ5AL<VAGKP 


;ijG;C^AAGCTACAG ACAA^GG7AAGCT7GCJkG7GAijii 


■i:-:::crc:AACGCTccrccrcrGArGcrGGTCGGGACCGCC7GAGrGArGcrAAGA 


Cr*.*Tf5f TfCCA ?GTC "GfTCCCAT fCGAACGTCACT "1 7" 


:-:A::;:AiGrrGCGAGGAGGAGAcrACGACCAr,c;c7GGC-.GAcrcAcrACGArrc7 





insert pLM1



i<A ID<G< - A v < *• ' :_0aS35DAC9C3LS?AK 



WO 98/24810 137 PCT/EP97/06956



Tuesday, 18 November 1997 13:57

fig 54 pLM1 (1 > 8265) Site and Sequence



AGccccccrcccccATTccTCGccccrccACTTcaGG4:::rrcsjC tacaagaagcc tcctcctgccacaggcacagccactg tc a r gc aa ac tgg re g 

3400

rCGGGGGGAGCCCGTAACGA6CGGGGAGGrGAAGCCCr*GGAA5CCGArGrTCTTCGGAGGAGGACGG TG TCCG 7G fCGG TGAC AGTACG 77 TGACCACC 



-insert pLM1

-ORF pLM1



<ypSG IA RPSTSJ$ f ^TKICPPPATGrATVflOTGG 

rrCAGCCACrcrCAGCAAGArCCAGAAGTCCTCAGGCA7C:CT37CAAGCCAGTAAATCGGCGCAAGACTAGCTT AGArG TTrcCAACAGCGCACAGCCA 
AAGTCGGTGAGAGrCGTrCTACGTCrTCAGGAGrcC3r*GGGACAGTTCGGTCArTTACCCGCGTTCTGArCGAArCTACAAAGGTTGTCGCGTCrCGGT 35 °° 



-insert pLMl 



-ORF pLMl 



5 A r L S K j O K S S G I ? V K P V N GRKTSLDVSNSAEP 

GuATTCC7GGCTCCTGGAGCCCGTTCTAACA TCCAGTAC:oCA5C:TGCCCCGGCCAGCCAAGTCAAGTTCrATGAGCGTGACCGGCGGGCGGGGTGGAC 

3600

CCTAAGGACCGAGGACCrCGGGCAAGATTGTAGGTCATGSCGrCGGACGGGGCCGGTCGGTTCAGTrCAAGATACrCGCACTGGCCGCCCGCCCCACCTG 



-insert pLMl 



-ORF pLMl 



G F L A P G A R S N t 0 T R SLEEP A K S S S H S V T G G R G G 

CrcGCCCTGrGAGCAGCAGCArrGACCCCAGTCTCCTCASCACCAAGCACGGACGCCTTACGCCrT CCAGACrGAAGGAGCCTACCAAGGTAGCCAQTGG 
GAGCGGGACACTCGTCGrCGrAACrGGGGTCAGAGGAGTCGrGGTrCGTCCCTCCGGAATGCGGAAGGTCTGACTTCCrCGGATGGTTCCArCGGTCACC 3?C ° 



-insert pLMl 



-ORF pLMl 



a R P V S S S I 0 P 5 L L S T < Q G G L T P S R L K g P T K V A S G 

■KGGACCACTCCAGCCCCTGrCAArCAGACAGArCGGuAAAAS^AJAAGGCCAAAGCCAAGGCAGTGGCCrT GGAC TCAGACAACATCTCCT TGAAGAGT 
CJCCrGGTGAGGrCGGGGACAGrTAGTCrGrCTAGC::T-r::?::rcCGGTTTCGGTTCCGTCACCGGAACCTGAGTCTGrTGTAGAGGA4CTTCTCA 



3800



-insert pLM1

-ORF pLM1



C rr ?*P VN0rD3; <£< AKAKAVALOSONISLKS 

A - ru;C7CCCCA^AGAGr^CTCC CAAGAA:;AAGCAAJC:A::::A:AGCCACCAAGCrGGCAGAGCTGCCACCAACCCCTCTCAGGGCCAC 4GCGAAGA 
■ --^C^AG0-3G7CrC7CATGAGGGTrCT7^TTCJ7T::3';:s;';rCGG TGG r TCGACCG TC rCGACGGTGG TTCGGG AG AG rCCCGG rs tcgc r tc t 



-insert pLM1

-ORF pLM1



1 ^ 3 ? £ S 7P KNQAS-*? 7AT KL AEL P P TPLRATAIC 

j:* VQTCAAA;cAcccrcAcrAGCCAA7:T7GACAA::-:^A:-;:AACAGTcrGGArcTAc:ArcATccAGTGArACCACccATGcrTCAAACGrc cc 
;i.iAACAGr7:GGrGG'5Aj:GArcGGr7A3AAcrG7:::i:--sA:jrTG 7c agacc t aga tgg tagt agg tcac r a tgg tggg r acg aag t r rcc aggg ttCC ° 



-insert pLM1

-ORF pLM1



J [ 7 \ 9 > S L A 'I L 0 K 4 N S M 5 L0LPSSS0TTHA3KVP 

rccAr^c r ac a agc rc agc Arc TCiCGcc ::t 7 tcaccccc ag rccoGC accc a rcc re aa ta r r aac tcagccajCttc rcc 

'.•••:-iAC3-A:GA7G:rcGA;fcGrAGAc::;::GGA-^:::xA::t:^AA3:GGGG0 tcag :ccg tgcg r agg ag r r a r aa r tgag tcgg * : : aag agg 



ORF pLM1

* * 3 ' ' 9 - ' i t ' r?S?A3|L*l|NSA3?5 



4100




CACCCCCTGGAGCrAArGJCrCCTTTCAGTGTGCCAJkAACJ-kGACCCGCATGTACCCCAAACTC TCAGGCC rGCACAGGAGCATGGAGf CCC rcCAGATGC 

4200

GrcCCGGACCTCGArrACTCACCAAAGTCACACGGTTrrCTCrGGGCGTACATGGGGrTTGAGAGTCCGGACGTGrCCrCGTACCTCAGGGAGGrcrACG 



-insert pLM1

-ORF pLM1



J 3 I E I H S G F S V ? < £ T a M T P K L 5 G X. H R 5 H E S i 0 M 

CAATGAGCCTCCCCAGrGCCTrCCCCAGCAGTACTCCCGTCCCCACCCCACCTGCTCCCCCTGCTGCTCCCACAGAAGAAGAGACGGAAGAGCrGACTTG 

4300

GrTACTCGGAGGGGTCACGGAAGGGGTCGTCATGAGGGCAGGGGTGGGGTGGACGAGGGGGACGACGAGGGrGTCTTCTTCTCrGCCrTCrCGACrGAAC 



-insert pLM1

-ORF pLM1



PMSLPSAFPSSTPVPTPPAPPAAPTEEETEELTV 
G AG TGGAAGCCCC AGAGC TGGGCAAC TGGACAG TAA7CAGC GGG A TCGGAACAC TC T TCCCAAGAA AGGGC TCAGG TACCAGC T TC AG TCCC AGGAGGAG 

4400

CrCACCTTCGGGGTCTCGACCCGTTGACCTGTCArTAGrCGCCCTAGCCTrGTGAGAAGGGTrCTTTCCCGAGTCCATGCTCGAAGTCAGGGrCCTCCTC 



-insert pLM1

-ORF pLM1



S G S P R A G Q L 0 S N 0 R 0 R N T L P K K G L ft YOL O S 0 E E 

ACCAAGGAGAGGCGACArTCCCATACCATTGGTGGGCTGCCTGAATCCGATGACCAGTCAGAGCTGCCTTCrCCCCCTGCACTrcCCATGTCTCTGAGTG 

4500

rGUTTCCTCrCCGCTGTAAGGGTArGGTAACCACCCGACGGACTTAGGCTACTGG TCAG TCTCGACGGAAGAGGGGGACGTGAAGGGTACAGAGACTCAC 



-insert pLM1

-ORF pLM1



7 < E a a h s h r igglpe s d o q s e l psppalphsus 

CAAAGGGCCAACTrACCAACArAGTGAGTCCCACr3C2GCC ACC ACGCCAAGAATCACCCGCTCCAACAGCArCCCCACCCACGAGGCGGCC TTCGAGCT 

4600

G" "TCCCGGfTGAATGGrTGrA fCAC fCAGGGTGACGCCSG 753" 3CGGTTCTTAGTGGGCGAGGTTGTCGTAGGGGTGGGTGC TCCGCCGGAAGC r CG A 



-insert pLM1

-ORF pLM1



A < G Q L r.N I V 5 P T ■* A T tp r t T rsnsiptmcaapel 

GTaCAGCGGC rCCCA AArGGGGAGCACCCrGTCCCTGGCCGAGASArcCAAGGGAATGATTCGGTCAGGArcCTTCCGAGACCCCACGG^CGATGrTCAC 
■. "TGrCGCCGAGGGrTrACCCCICGrGGGACAGGGACtGGC^TC'jGGTrCCCrTACrAAGCCAGTCCTAGGAAGGC "C TGGGGTGCCTGC TACAAGTG 



-ORF pLM1



' SG SQM GSTLSLA iJJ K GJi 1RSGSFR0P r 0 0 V H 

J'i'J r C AGTG C r G TCCC TGGCCrcCAGIGC:rCCrCCA::TA::;:-:AGCrGAGGAGAGGATGCAATCTGAGCAAAT:CGGAAGCTTCGrA:GGAACTGG 

4800

C:JA3 TCACGAC AGGGACCGGaGG'C ACGG AGGAGG* j jAG3 - j fCGACTCC TC TCCTACGTT4GAC rCGTTTAGGCCTTCGAAGCArccCTrGACC 



-insert pLM1

-ORF pLM1



*g atcc c -oGaaaaag fgiccacc t ? gacg r: :■: * ; : :: • : r: : : aa 7gc taa tc tggtggctgc ttt tgagcagag-c tgs re aa t a r s «c a rc ccg 

4900

-y. ■ i r. rc a; ^GC tOG^-**! ^GCAjAG*; saa a 3a; • ; T ? ACGAT f agaCC ACCGACGAAA AC TCG TC ' r . IGaCC AC f * a f aI rGFAGGGC 

insert pLM1

ORF pLM1

i 'S :-'lANLVAAF£0: . * N *• T 5 R 






-insert pLM1

-ORF pLM1



tnMLACrAEeKOTiLLOLREriOPLKKKNSEAQ 



GCAGrCATrCAGGG4GCCCTTAATCCCTCAGAAACCACA:::AAAGAACrrCGGATCAAGAGACAAAACTCCrCAGATAGCATC TCAAGCC TCA ACAGCA 
CGTCftGTAAGTCCCrCGGGAATT^CGGAGTCTTrGGTGrGGSrrTCTTGAAGCCrAGTTCTCrGTTTTGAGGAGTCTATCqTAGAGrTCGGAGTTGrCGT 5, °° 



-insert pLM1

-ORF pLM1



■IV I OG AC.NAS E T T 3< £ L 8 1 K RONSSDSISSLNS 

rCAC rAGCCATTCCAGCATCGGCAGCACCAAGGATGCTGA-;;3AAAAAGAAGAAAAAAAAGAGr TGGGTCrATGAGCTTCGAAGTTCCTTCAACAAAGC 
AUrGATCGGrAAGGTCGrAGCCGTCGTCGrTCCTACGACTACSCTrTTTCrTCTTTTTTTTCTCAACCCAGATACrCGAAGCTrCAAGGAAGTTGrTTCG 



-insert pLM1

-ORF pLM1



1 T 5 H S S 1 G 3 S K 0 A 0 A < KK K K K S V V tEL»SSFNKA 

GrTCAGTArAAAAAAGGGGCCCAAGrCAGCTTCCTCArACTC3GATATAGAGGAGATTGCTACACCCGACTCTTCAGCC CCCTCATCCCCCAAACTACAG 
CAAOrCATATTTTfTCCCCGGGTTCAGTCGAAGGAGTATGA-jCCTATATC TCC TC TA AC GA TG TGGGC TGAGAAG TCGGGGG AG TAGGGGG TTTGAfGTC 5000 



-insert pLM1

-ORF pLM1



FS tKKGPKSASST SOI £g IATPQSSAPSSP< L 0 

CArGGrTCCACAGAGACrGCrrCACCCTCCATCAAGTCCrClArrrTGTCCTCCGTGGGCACrGATGTCACCGAGGG CCCTGCTCACCCAGCCCCCCACA 
3:ACCAAGGrGTCrci3ACGAAGTGG5AG3TAGrrCAJGA:r:3AACAGGAGGCACCCGTGACrACAGTGGCTCCCGGGACGAGrGGGrCGGGGGGTGT 



-insert pLM1

-ORF pLM1



H g S T E rAS;> S ; < S S * L SS V G fQ VrCGPAHPAPH 
:*^^0CTGrrCCArGCAAATaA2GAGGAGG^:::AGAQAA:iA23A:GTArCGGAGCTGCGCTCTGAGCT 4rGGGAGAAGGAAATGAAGCrTACAGACAr 

S4?r;cGACA^GGrACGrrrACTccrccTc:r:GGrcT:rT:'"::r;cArAGCcrcGACGCGA5AcrcGArAcccrcTrccTTrACTTCoAArGKTGTA 5500 



-insert pLM1

-ORF pLM1



r * - F H A * E g £ E * £ * ' S V S E L R S E L V E < £ H K L f 0 1 

; : zz r rss agcccc re aactctscccaccaac r jsa: :as: * • : : : sagacca tgcacaaca tgcag r tgo agg tggacc tgc tgaaagcag acaatgac 

; i-:3AAc:rccGjGAG:rGAGACG'3GTGC r T;A:cTA3:;;i^:;;:rc togtacgtct igtacgtcaacc rccACCTGGACGACTrrcGTc tcttactg 5 °°° 



-insert pLM1

-ORF pLM1



I L e 4L? «SAH0.C-. = £rMMMM0L£V0LLlCA£N0 



■v ' ;aa yj rA.j.;c::A3G Cccc tcatcag:: :cca: t::^ ::: : igc rccc tsgatc atc tgca t t atc t tccccacgccgc tccc taggcc tcgcac 

5700

■ j..-. . i . A 1%. «CG j rc; jGGGA j fA»j 7^Z3A3G T jA^^ TIT I *CCA jCGaCC TAG T ACACCTAA f a 3AAGCGGTGCGCCGAGG0ATCC ~SACCGTG 



-insert pLM1






CC TGCGACACCTGGCAGAGACGGCC GAGGAGAAGGAC AC TGAjC TGC TGGA TT TGCG AG AA AC CAT AGAC TT TC TGAAGAAA AAGAAC TC TGAGGCCC AG 

5000

GGACGCTGTGGACCGTCrCTGCCGGCTCCTCTTCCTGTGACr:GACGACCTAAACGCTCrTTGGTArCTCAAAGACTTCrrrTTCTrGAGAC TCCGGGTC 



" ' — Z*\f uLM I — 




rcAcccArTccTrccccccCAGrcrrG CACACACAG^ccT3r:Ac:cArGGArGGCATCAG 7act7G730 rccAAAGCAi5.»AG rGAcccrccGGGrGcr 

5800

AGrGGGTAAGGAAGCCGGGGTCAGAACGrCTGTGTCTGGACAGTGGGTACCTACCGTAGTCArGAACACSAGGTTTCCTCCTrCACrGGGAGGCCCACCA 



-insert pLM1

-ORF pLM1



L rMSFGPsuAoroLS^woG i st c : p k £ r v r l r v v 

GG7GAGGATGCCCCCGCAGCACATCArCAAAGGGGACTT3A43CAGCAGGAATrC rTCCTGGGCTGTAGCAAG GTCAGTGGAAAAGTTGACrGGAAGATG 
CCACrccrACGGGGGCGTCGTCTAGTAGTrTCCCCTGAACTTCGTCGTCCrTAAGAAGGACCCGACATCG TfCCAG TCACC TTTTCAACTGACCTTC TAC 59 °° 



-insert pLMl 



-ORF pLMl 



v RH PPQ H I I K G 0 I < QQ EF F I G C S X V S G K V 0 V K M 

CTUU ArGAAGCTGTrTrcCAAGyGTTCAAGflAC FATA T7 FC "AAAAFGGACCCAGCC TC TACCCFGGGAC fAAGCACTOAjTCCA TCCATGGCTACAGCA 
GACCTACTrCGACAAAAGGTTCACAAGTTCCTGATATAAAG^rrTrACCTGGGTCGGAGATGGGACCC FGATFCGTGAC TCACG TAGG FACCGATGFCG7 



6000



- insert pLMl 



-ORFpLMl 



L0£Av rQVF ( { 0Yt 5<?10PAS Tl.G LST£S lH GY S 

TCAGCCACGTGAAACGAGTGrrGGArGCAGAGCCCCCCGAGATGCCrCCTrGCCGTCGAGGTGTCAArAACATATCA GTC TCCC TCAAAGGTCTGAAGGA 
AGrCGGTGCACTrTGCrCACAACCTACGTC fCGGGGGGC T: "ACGGAGGAACGGC AGCTCCACAGTFAFTG7AFAG FCAGAGGGAG7FFCCAGAC TFCCT 



6100 



-insert pLM1

-ORF pLM1



I 3wv< av_tOAEP_?e«PPC R R G V N N ( SVSLKGLKC 

GAAArgCGrCGACAGCC7GGrGTTC5AGACGCTGA7C:::^i:::3ATGArGCAGCACrACArAAGCCr::7GC TGAAGCACCGGCGCCTCGTCCTCTCG 
CrTTACGCAGCTGTCGGACCACAAqcTCrjCGACrAS^G'-^GCTACTACGTCGTGATGTArTCGGAGaACGACTTCGnOCCGCCGAGCAGGAGAGC 



-insert pLMl 



-ORF pLMl 



V C * 0 S L v F S [ t I a < P M * Q h r | $ t L < h 3 R L y L S 
CCAGCGGCACGGGCaaGaCC TAtC TQACCAA7C3C T* jASFACCFGG TGGAGCGC TCTGGCC j T ^AGGTCACA3A-jGGCArCGrC AGCACC T 

y'- ; *'"*rcGCCGF:;cc^^ 



6300



-insert pLM1

-ORF pLM1



j P S C r 5 < T * L r N 3 . a r y L v £ * S G * f. Y F £ G I V S F 

••--^GCAccAGCAGr:7::cAAGGA7CF:;:AA:7:rAV — 

^ :: ;' A C*rSGrcG7CAGAACG77-:c7A;A:3 77^:^7^ ,!iC ° 



-insert pLM1



N M H 0 g s C < Q L ) L • . 3 U LANOIO^STGIGOVPLV 

'-"•* A ' r ^r*ACcrax;rsAA;c*c;c-::A7^ 

insert pLM1

ORF pLM1




ACCAArCACCCTGTAAAAATCACACCCAACCATGCCTrSCACrTSASCTTCACGATGTrGACCrTCTCCAA CAACG TG GAGCCAGCCAAr GGCTTCC fGG 

Tum r fAGTCGGAC AT T IT TAC TGTGGQT TGGTACCGA ACSTGAAC TCGAAG TCC fACAAC TGG AAGAGGT 7G TTGC ACC fCGG TCGG T TACCGAAGG ACC 6600 



-insert pLM1

-ORF pLM1



r N Q P V K M T P M H Q L j< I S F fl fl L f F S H N V E P A N G F L 

rrCGrTACCrGAGGAGGAAGCTCGTAGAGrCAGACAGCGACATCAArGCCAACAAGGAAGAGCTGCrrC GGG TGC TCGAC fGGG T ACCCAAGCTGTGGTA 
AA'aCAATOGACTCCTCCfTCGACCArCTCAGTCrGTCCCrSTAGT rACCGTTGTTCCTTCTCGACGAAGCCCACGAGC TGACCCATGGGTTC6ACACCAT 97 °° 



- insert pLMl 



-ORFpLMI 



y**LR RK LVE S0 SO I N A N K £ ELLRVLOVVPKLVV 

TV47CTCCACACCrTCCTTGAGAAGCACAGCACCrCA3A;7TCCTCATCGGCCCrTGCrTCTrTCTGTCGTGTCCCATTGGCATTGAGGACTTCCGGAC C 
AUTAGAGGrGTGGAAGGAACrCTTCGTGTCGTGGAGTCrSAAGGAGTAGCCGGGAACGAAGAAAGACAGCACAGGGTAACCGTAACTCCTGAAGGCCrGG 



" insert pLMl 



-ORF pLM1



M t H T F I E K H S T S Q F L I 0 P C F F I S C P I G I £ 0 F R T 

TGGTTCATTGACCrGTGGAACAACTCTATCATTCCCTATCTACAGGAAGGAGCCAAGGATGGGArAAAGGrcCArGGACAGAA AGCTCCTTGGGAGGACC 
ACCAAGTAACTGGACACCTTGTTGAGATAGTAAGGGArAGArGTCCrTCCTCGGrTCCTACCCTATrTCCAGGTACCTGrCTTTCGACGAACCCTCCTGG 



6900 



-insert pLM1

-ORF pLM1



vFl °l v NNSt lPY> 3SGAKQ G1IC VHGQKA A V E 0 

CAG7GGAArGGGTCCGGGACACACTrCCCrGGCCATCA2:::AACAAGACCAArCAAAGCTGrACCACCTGCCCCCACCCAC CGTGGGCCCrCACAGCAT 
•:r:A:crTACCCAGGCCCTGrGrGAAGGGACCGGTAG7r5G:rT3 7r:TGGTTAGrTrCGACATGGTGGACGGGGGTGGGTGGCACCCGGGAGTGrCGTA 



7000 



-insert pLM1

-ORF pLM1



» V £ v V R 0 T L P VPS a >: Q D O S KL r H LPPPTVGPMSI 

rilCrCACCrCCCGAGGATAGGACAGTCAAAGACAGCA::::AAg'7CTCrGGAC TCAG ATCC TCTGATGGCCATGCTGC TGAAACTTCAAGAA GC rccc 
V.'UCiA-l TGQAGGGCTCC TATCC TG T CAGTTfCTCrC3r j3C jTCAAGAGACCTGAGrC TAGGAGACTACCGGTACCACGAC TrtGAAGfTC TTCGACGG 7 ' C ° 



- insert pLM i 



-ORF pLMl 



ASP?eo « T V*PS T PSSLOSOP LM AfiLLiCLO S 4 A 
a.'.;' r AC a 7 7^3 AG fCTCCAGA fCGAGAAACCATCrTGGACtt I r rCAjSCAACAC T TTAAGGGf TCSGCaa TCAC TG fCACCCCCGGAC AGCAGAAC 

r^jAr::AAcr:AGAGGrcrAGCTcrTTo:rAG:ACc:^::rr2:^^GrccGTrGTGAAArrcccAAGCc--: r r ag fgac ac rGGGcccc r-: tcgtc r tg ' 200 



-insert pLM1

-ORF pLM1



Sl * > £ S ? D R g f I L 0 ? N . 3 A T L G F G H H C * P R T A E 

'• T: "^c*ocrAf;rrAGcrccTCCTC^ _ 

■' *a:.* *-j r j j TCGA TACAA rC5AGGAGGAGA3CG ^AGA ZZ~Z-z-Z~ ' 1 **< *5*CCGAGAGG TCGGGG r CC TCC fC f fG TCC fCCC TC T f CC "C TAC f TTC 



- insert pLM 1 



l-LLS*LL" ) 5 f G S P A P G E Q £ g 3 3 £ R 




A'iUAGGCACAGCTrCTTGGTGCTGTACCTTTGAGAAC ITCC TAGGAAGGAATGGrGGGG rGGCGTTJGGGAACTTG TGCCCCCTAAACACAfTTA CTGGC 
rCCrCCCTGrCCAAGAACCACGACArGGAAACTCrTCAAGGATCCTTCCTrACCACCCCACCGCAAACCCrrGAACACGGGGGATTTGTGTAAATGACCG ^ 



-insert pLM1



♦iGTGSVCC TPEHf LGa MG GV A PGNLCPLN X F T Q 

Cr:crCTAArGACrTTGGGGAAAAGATGArTCTGGGT:rrrCCCTrGACTrCTTGrTTCAArTACAAACrCCTGGGCTr TCTGGGGAGGGGTTCAGAAAA 
GAGGACAT TACTGAAACCCC TTTTCTAC TAAGACCCAlAAAGGGAACTGAAGAACAAAGTTAArGTTTGAGGACCCGAAAGACCCCTCCCCAAGTCrTTT 7500 



-insert pLMI 



11 . . U G K 0 0 S G S F P . L L V S I TNSVAFVGGVOK 

CATCAAAACACTGCAGCAGTrCCCCGGAATTCAGCTTSSACrTAACCAGGCfGAACTTGCTCAAAAGAAGCCGAArTCCAGCACACTGGCGGCCGTTACT 

" 1 ' 11 ' "■" ■ " * 1 ■ i i i i j . . . . t - . t | f 7600 

♦3rA5TTTTGrGACGrCGTCAAGGG6CCTTAAGTCGAAC;TGAArTGGTCCGACTTGAAC6AGrTTTCTTCGGCTTAAGGTCGrGTGACCGCCGGCAA TG A 



-insert pLM 1 



rCAAGATCTCGCCGGCGGTGGCGCCACCTCGAGGTTAAGCGGGAT ArCACTCAGCATAATGCGCGCGAGTGACCGGCAGCAAAATGTTGCAGCACrGACC 



r s k m c ssspef slol trl nl lkrsr ipahvrpll 

AGTTCTAGAGCGGCCGCCACCGCGGTGGAGCTCCAAT'CGCCCTArAGTGAGTCGTATTA CG CGCGCTCAC TGGCCGTCG TTTTAC AACGTCGTCACTGG 

TCAAi 

v u z a p p p a v s s n s p t s £ s y r a r s l a vvlorrov 
gaaaaccctggcgttacccaacttaatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaagaggcccgcaccg atcgcccttcccaac 

CrTrrGGGACCGCAATGGGirGAATrAGCGGAACGTC^rGTAGGCGGAAAGCGGTCGACCCCATTATCGCTrCTCCGGGCGTGGCTAGCGGGAAGGGTTG 7800 
£ N P G V T Q I N R I A A M P P F A S V R N S CSARTORPSQ 

A^TTGCGCAGCCTGAA rGGCCAATCGGACGCGCCCTGTASC jGCGCATTAAGCGCGGCGGGTG FGG TGGITACGCGCAGCGTGACCGC TACACTTGCCAG 
rCAACGCGrCGGACTTACCGCTTACCCTGCGCGGGACA-rSCCGCGrAATTCGCGCCGCCCACACCACCAArGCGCGTCGCACrGGCGATGrGAACGGTC 79 °° 
OUR SL NG EVOAP C 3G A i S a A GV VVTRS V T A T t A S 

CGCCCTAGCGCCCGCTCCTTrCGCTTTCTTCCCTrc:::T:r:5CCACGTrCGCCGGCTTTCCCCGrCAAGCrCTAAATCGGG5GCTCCCTrTAGGG TTC 
•jI j-jGArCGCGGGCGAGGAAASCGAAAGAAGGGAAGGAA-jASCSGrGCAAGCGGCC^ 8000 
A LA PAPrAPFPS-LAT FaCFPROALNRGLPLGF 

:i^7rTAGrGCTrrACGGCACCrCGACCCCAAAAAaC^-^:rA3^GTGATGGTTCACGTAG7GGGCCATCGCCCTaArAGAC-3GTTr TTCGCCCrTTGA 
^.■TA4ATCACGAAArGCC3T5GAGCrGGGGTTTTr7G4>i::AArc;CACTACCAAGTGCATCACCCGGrAGCGCGACTArC7GCCAAAAAGCGGGAAACT Q,C ° 
3F SAL RHL OPK<L G06SR SGP SP. . r V F R P L 

•I i7 7GGAG 7CCACG rrcrT"AArAGT"GGAC yC7TGTT::^A:rGGAACAACACrCAACCCTArCTCGGTCrATTCTTTTGATTTATAAGGuArTrTGCC 
GCA*CCrCAGGTGCAAGAAArrArCACCTGAGAACAA3 377TGACCrTGTrGTGAGTTGGGATAGAGCCAGArAAGAAAACTAAATArrCCCrAAAACGG 
f L £ 5 r F F H S GLLFQT^rTLNPISVVSFOL .GILP 



;^7rrcGGCCTArrGG7rAA&AAArGAQc tgatttaa^^aa— r*AcicGAArrrrAACAAA ArATTAACGcrTACAATrTac 
';-A4AGccG3ATAAccAA7Trr7TA:rcoAC7AAA:-r-T7:^ArrGCGcrrAAAATTGfnrATAArrGC-3AAr;rTAAA7c 

I S A Y W 1, K M £LI . :<?NANFNKILTLri 






Tuesday, 18 November 1997 11:48
fig 34 pLM4 (1 > 10070) Site and Sequence 
Enzymes : 100 cM 46 enzymes (Filtered) 

Settings: Linear, Certain Sites Only. Standard Genetic Code 



Page 1



TAG TTAT TAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCA TATA TGGAGTTCCGCG fTACATAAC TTACGG TAAArGGCCCGCCTGQC TGACC3 
ATCAArAArTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGrATATACCTCAAGGCGCAATGTATTGAATGCCAr TrACCGGGCGGACCGACTGG-- 

' H l tju : 



LLtVlNrGVISS 



P I YGVPRY I TYGKVPAWLT 



CCCAACGACCCCCGCCCATTGACGTCAATAArGACGTATGTTCCCATAGTAAC6CCAATAGGGACrTTCCArTGAC GTCAATGGG TGGAGTATTTACGGT 
GGGTTGCTGGGGGCGGGTAACTGCAGTrATTACTGCATACAAGGGTATCATTGCGGTTATCCCTGAAAGGTAACTGC AGTTACCCACCTCATAAATGCCA 

COT 

A Q R P P P I QVN NOVCSHS N ANROFPL T5MGGVFTV 

AAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGAC GGTAAATGGCCCGCCTGGCATrArGCCCAGTA 
TfTGACCCGTGAACCGTCA TG TAGTTCACATAGTaTaCGGTTCATGCGGGGGATAACTGCAGTTAC TGCCATTTACCGGGCGGACCGTAATACGGGTCAT 



-pCMV



NCPLG5T3SVSYAKYARY 



R Q 



MARLALCPV 



CATGACCTTATGGGACTTTCCTACrTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTG GCAGTACATCAATGGGCGTGGA 
GTACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCC AA AAC CG TC A TG TAG T T AC CC GC ACC T 



HQL MGLSYUA VHL R! SHR Y V M G 0 A V L A V H 0 V A V 

tagcggtttgac tcacggggatttcc aagtctcc ACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGAC TTTCCAAAATGTCGTA 
ATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAGTTACCCTCAAACAAAACCGTGGTTTTAGTTGCCCrGAAAGGTTTTACAGCAT 



r a v 



L T G , , SKS?3H.RQVErVLAPKSTGLSKMS 



ACAACrCCGCCCCATrGACGCAAATGGGCGG7AGGC572TACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGT GAACCGTCAGATCCGC TAGCGC TA 
7G t TGAGGCGGGGTAACTGCGT7TACCC3CCATCCG" AC ATGCCACCCTCCAGATATATTCGTCTCGACCAAATCACTTGGCAGTCTAGGCGATCGCGAT 



-pCMV 



0 L R P IDANGR 



- C TVGGLYKQSVFSCPSOPLAL 



CCGGrCGCCACCArGGrGAGCAAGGGCGAGGAGCTGrTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAG r TC AGCG 
GOCCAGCGGTGGTACCACTCGTTCCCGCrCCrcSACAAGTGGCCCCACCACGGGrAGGACCAGCTCGACCTGCCGCTGCATTTGCCGGrGrTCAAGrc^C 



- V A 7 M y 3 K G E £ L " T Q y y p j LVtLOGOVNGHKFS 

7 'j "CCoGCGAGGGCGAGGGCGA'GCCAC C TACGGC TGACCC TGAAG TTCAfCTGCACCACCGGCAAGC TGCCCGTGCCCT GGCCCACCC TCG TGAC 
AC AGGCCGC TCCCGC TCCCGC TACGG T3GATGCC 3 ~ "CGAC TGGGAC TTC AAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACCGGG TGGGAGC AC TG 



V 3 G E G £ G 0 A T YG<LTLKF[CTTGKLPVPVPruVT 



L* ^-CC fG~CC TACGGCG 'GCAGTGC T*TC AGCCGC '-CCCCGACCACArGAAoCAGCACGACTTC TTCAAGTCCGCCATGCCCGAAG GCTACG^CCAGGAu 
i " 5GGAC FGGATGCCGCAC j'C ACGAAGTCGGCu A'GC-GGC TGGT5 ~AC T TCGTCG 7GC TGAAGAAG T TCAGGCGG TaCGGGC TTCCGA TGC AC5TCC T'Z 



' '" --• •> J -- "' — — 

0HMI.QH0FFK S A M o £ G V V 0 £ 






CGC ACCATC TTC TTCAAGGACGACGGCAAC TACAACACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTCG TGAACC GCATCGAGC T'CAAGGGCATCS 
GCG TGGTAGAAGAAGTTCC tGCTGCCGTTGA7GTTCTGGGCGCGGCTCCACTTCAAGCTCCCGCTGTGGGACCACTTGGCGrAGC TCGACTTCCCGTAGC 

R T t /FKODGN YKT RA EVKFE GOTL V N R t E L K G I 

ArrTCAAGGAGGACGGCAACArCCTGGGGCACAAGCr3GAGTACAACTACAACAGCCACAACGTCrATATCATG GCCGACAAGCAGAAGAACGGCA TC AA 
rGAAGTTCCTCCTGCCGTTGTAGGACCCCGTGTTCGACC TCATGTTGATCTTGTCGG TGTTGCAGATATAGTACCGGC TGTTCGTCTTCTTGCCGTAflTT 



OF KEO GM ILG HKLEYN Y N S H N V Y IHAQKQKNG I > 

GGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCC ATCGGCGACGGCCCCGTGCTGC To 
CCACTTGAAGTTCTAGGCGGTGTTGTAGCTCCTGCCGTCGCACGTCGAGCGGCTGGTGATGGTCGTCTTGTGGGGGTAGCCGCTGCCGGGGCACGACGAC 



1200



V N F ■ K I R H N [ E 0 G S V Q L A 0 H Y Q Q NTP I G 0 G P V L L 

CCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCG T GACCGCCGCCGGGA 
GGGCT6TTGGTGATGGACTCGTGGGTCAG6CGGGACTCGTTTCTGGGGTTGCTCTTCGCGCTAGTGTACCAGGACGACCTCAAGCACTGGCGGCGGCCCT 



p ONHYLSTQSALSKOPNEKRDHM 



VLLEFVTAAG 



TCACrCTCGGCATGGACGAGCTGTACAAGTCCGGACTCAGATCTCGAGCTCAAGCTTCGAATrCTGCAGT CGATAAGCTTGATATCGAATTCCTGCAGCC 
AGTGAGAGCCGTACCTCCTCGACATGTTCAGGCCTGAGTCT^^^ 



j TLGMO£ L Y K S 3 L * S P A Q A S N S A V D K L D I E F |_ 0 P 
CC7GCTCTrCAGCCAGATGCTGGACCCAGAG7C:CAGAGAAAGAGGACAGrGCAGAATGTCCTGGATCrcCGGCAGAACCTGGAAGAGACCATGTCCAG: 



GGACGAGAAGTCGGTCTACGACCTGGGTCTCAGGGTCTCTTTCTCCTGTCACGTCTTACAGGACCTAGAGGCCGTCTTGGACCTTCTCTGGTACAGGTCC 




ORF pLMI 



L L F S Q M L 0 P £ 5 Q P * RT V Q N V I 0 L R 0 N L E E T M S 3 



CrGCGAGGGTCCCAGGrGACTCACAGCTCCCTGGAGA-GACCrGCTACGACAGCGATGATGCCAACCCACGCAGCGTGTCCAGCCTCT CCAACCGC TCQ" 
GACGCTCCCAGGGrcCACTGAGTGrCGAGGGACCrCTACrGGACGATGCTGTCGCTACrACGGTTGGGTGCGTCGCACAGGrCGGAGAGGrrGGCGAG.^ 



-insert pLM1

-ORF pLM1



RGSOVTHSSL 



TCYOSOOAMPR SVSSL3NRS 



ICCC fCTGTC ATGGCGC TATGGCCAG TCCACTCZGC3 GC TGC AGGC TGGTCACGCGCCCTC rGTGGGTGGGAGCTGCCGC TCGGAGGGGACGCCCGCC T 3 



GG3GAGACAGTACCGCGATACCGGTCAGGTCAGGCGC:GACGTCCGACCACTGCGCGGGAGACACCCACCCrCGACGGCGAGCCTCCCCTGCGGGCGGA: 




■> L 



-ORF pLMI 

•1 A G 0 A P 3 v G G S C R S E G r 




G~AeA7GCACGGCGAACGGGCCCAC TACTCCCACACCA73CCCA7GCCCA3CCCCAGCAAGC TCAGCCATaTC rCCCGCC 7G3A3CTG£ "CGAA'CCC 
CaTGTACGTGCCGCTTGCCCGGG TGA7GAGGG73 7GGTACGG37 ACGCGTCGGGG TCG7 7CGAGTCGG7A7AGAGGGCGGACC 7CGACCA3C7TAG3GA" 



-ORF pLMl 



v W H G E R A H Y 3 H 7 li P M R S P 3 K L 5 H I SRLCLVEdL 

g<ictcggatgaggtggacctcaagtccggctacatga gcgacagtgacctcatgggcaagaccatgacggaggatgatgacatcactaccggctgggatg 
ctgagcctactccacctggagttcaggccgatgtactcgctgrcactggagtacccgttctggtactgccrcctactactgtagrgatggccgaccctac: 



■insert pLMl 



-ORF pLMl 



CS0EV0LKSGYMSOSOLMGJCTMTEO00 ITTGVD 
■ . — 

AAAGCAGCTCCATCAGrAGrGGACTCAGCGATGCCT CAGACAATCTCAGTTCAGAAGAATTCAATGCCAGCTCCTCACTCAAC TCCCTCCCAAGTACrcC 
TTrCGTCGAGGTAGTCATCACCTGAGTCGCTACGGAGTCTGTTAGAGTCAAGTCTTCTTAAGTTACGGTCGAGGAGTGAGTrGAGGGAGGGTTCATGAGG 



2000



-insert pLM1 



-ORF pLMl 



E 3 S 3 I S S GLSOASDNLSSEEFNASSStNSLPSTP 

CAZ 7GCT7CTCGCAGGAAC TCAAC AA7AGTGC TACGCAC AGACTCAGAGAAGCGC TCACTGGCAGAAAG TGGGC TGAGCTGG TTTAGTGAA7CA3AGGAG 

' ' ' 1 1 I 1 i ' I ■ ' > ■ I i i ? | V 

G7GACGAAGAGCGTCCTTGAGTTGTTATCACGA7GC3TGrCT3AGTCrCTTCGCGAGTGACCGTCTTTCACCCGACTCGACCAAArCACTfAGTCTCCTC 



-insert pLM1 



-ORF pLM1 



r A 3 R R MS T I VLRTOSEKRSLAESGLSVFSESEE 
A wACCCCCTAAAAAAC 7GGAGTACG AC AGTGG T AGCC 73AAGATGGAACC T3GGACT7C 7AAGTGGCGGAGGGAGCGGCC TGAGA3C 7GT3 AT3A7 TCi" 



•':33GGATTTTTTGACCTCATGC7GTCACCA-CG3ACrTC7ACC7rGGACCCTGAAGA77CACCGCC7CCC7CGCCGGACTCTCGACA:7ACTAAGTA 



-insert pLMl 



-ORF pLMl 



<AP KKLEYC 3 G3LKME P G TSKtfPQERP£3CDD3 

CC AAGGG 7GGAGAAC TGAA AAAGCCCATCAGCC*GG3 » C ACC Z 7GG 7 7CC C 7GAAGAAG33C AAGACCCCACC 7GTGGCTG7AAC TTCCCCC A TC AC 7C A 
GG" "CCC ACC TC 7TGAC 7T TT TCGG3 7A3TCGGACCC 33TGG3ACC AAGG3 ACTTC77CCCG 7TC 7G GGGTGG ACACCGAC AT7G AAGGG33 7A373AGT 

insert pLM1

ORF pLM1

> * CiG£UK<P| S_3-P3SLK<GK 7 P P V A V T > - I TH 





• C ACAGCCCAGAG TGCCC TCAAA G TCGC A3GC A A AC C fGAG GGCAAAGC TACAGACAAGGG TA ACCT TG C A3 TG AAGAA T AC TGG3C TCC AACGC 7CC TCC 
GT3TCGGGTCTCACGGGAGTTTCAGCGrCCGTTrGGACTCCCGTTTCGATGrCTGTTCCCATrCGAACGrCACTTCrrArGACCCGAGGrTGC3AGGAGu 

insert pLM1

ORF pLM1

TAOSALKVAGKPEGKATOlCGKLAVJCNrGLQRSS 

TCTGATGCTGGTCGGGACCGCCTGAGTGATGCTAAGAAGCCCCCCTCGGGCATTGCTCGCCCCTCCACTTCGGGATCCTTCGGCTACAAGAAGCCTCCTC 
AGACTACGACCAGCCCTGGCGGACTCACTACGATTCTTCGGGGGGAGCCCGrAACGAGCGGGGAGGTGAAGCCCTAGGA AGCCGATGTTCrTCG3AGGA3 ^ 

insert pLM1

ORF pLM1

^OAGRORLSOAKKPPSG I APPSTSGSF G Y K K P p 

CTGCCAC AGGCACAGCCACTGTCATGCAAACTGGTGGTTCAGCCAC TC TCAGCAAGATCCAGAAGTCCTCAGGCATCCCTGTCAA GCCACTAAATG3GC'3 
GACGGTGTCCGTGTCGGTGACAGTACGTTTGACCACCAAGTCGGTGAGAGTCGTTCTAGGTCTTCAGGAGTCCGTAGGGACAGTTCGG7 CATTTACCCGC 

insert pLM1 

-ORF pLMl 

patgtatvmotggsati.sk I q k s 5 g i p v k p v n g p 

CAAGACTAGCTTAGATGTTTCCAACAGCGCAGAGCCA3GATTCCTGGCTCCTGGAGCCCGTTCTAACATCCAGTACCG CAGCCTGCCCCG3CCAGC:AA3 
G 'TCTGA TCGAATC TACAAAGGTTG TCGCGTCTCGGTCC TAAGGACCGAGGACCTCGGGCAAGATTGTAGG TCATGG CGTCGGACGGGGCCGGTCG37T3 

insert pLM1

ORF pLM1
KTSLDVSNSAEPGFLAPGARSNI0YRSLP4?A> 



TCAAGrTCTATGAGCGTGACCG3C33GCGGGGT3GACCTCGCCCTG TGAGC AGCAGC AT TG ACCCC AG TC TCC TCAGCA CCAAGC AGSGaGGCC TTACG- 
aG'TCAAGmTAC rCGCACTGGCCGCCCGCCCCAC'ITGGAGCGGGACACTCGrCGTCGTAACTGGGGTCAGAGGAGTCGTGGTTCGTCCC'CCGuAATG'JU 



insert pLM1
ORF pLM1

S S S H S v 7 G GROG » R P V S S S I D P S L L 3 T K 0 G G L T 

C rrCCAGACTGAAGGAGCC rACCAAGC TAGCCA3 'GG3C 3GACCAC T CCAGCCCC TG TCAArCAGACAGATCGGGAAAAGCAG AAGGCCaAAGCCAAGG- 
CAAGGTC rGACTTCCTCGGATGGTTCCATCGOTCACCCSCC TCGTGAGGTCGG^^ 

insert pLMl — 






ORF pLMl 

f ^ p L .<E : >r<7AS35TroAp.;flQTCPExEKM.:A>^ 






AGrGGCCTTGGACrCAGAC^CArCTCCTTGAAGAGrATrGGCTCCCCAGAGAGTACT CCCAAGAACCAAGCAAGCCACCCCACAGCCACCAAGCTGGC^ 
7CACCGGAACCTCAGTCTGTTGTAGACGAACTTCTCATAACCGAGGGGTC TCTCATGAGGCT rCTTGGTT CCTrCGGTGCGG TGTCGG7GG TTCGACCGT 

-insert pLMl 



20: 



-ORF pLM1 



V A L 0 S 0 N 



SLKSI GSP E STPKNQASHPTATKt 



GAGCTGCCACCAACCCC TC TCAGGGCCACAGCGAAGAGC TTTGTCAAACCACCCTCACTAGCCAATC TTGA CAAGGTCAACTCCAACAGTC TGGATCTAC 
CTCGACGGTGGTTGGGGAGAGTCCCGGTGTCGCTTCTCGAAACAGTTTGGTGGGAGTGATCGGTTAGAACTGrTCCAGTTGAGGTTGTCAGACCTAGATG 



3100



-ORF pLMl 



ELP P TPLR AT AKS FV KPPSL ANL OKV N S N S L 0 I 

CATCATCCAGTGATACCACCCATGCTTCAAAGGTCCCAGATC TGCATGCTACAAGCTCAGCATC TGGGG GCCC TC TCCCTTCCTGCTTCACCCCCAGTCC 
GTAGTAGGTCACTATGGTGGGTACGAAG TTTCCAGGGTC TAG ACGTACGATGTTCGAG TCGTAGACCCCCG GGAGAGGGAAGGACGAAG TGGGGGTCAGu 

-insert pLMl 



-ORF pLMl 



PS SSOTT HAS KVPOLHA TSS A SGGPLPSCFTPSP 

GGCACCCATCCTCAATATTAACTCAGCCAGCTTC TCCCAGGGCC TGGAGCTAATGAG TGGTT TC AGTGT GCCAAAAGAGACCCGCATGTACCCCAAAC TC 
CCGTGGGTAGGAGTTATAATTGAGTCGGTCGAAGAGGGTCCCGGACCTCGATTACTCACCAAAGTCAC ACGGTTTTC TC TGGGCG TACATGGGG TT TGA3 

-insert pLM1



yyy. 



-ORF pLMl 



A p * l N I NSASF50GLELMSGFSVPKETRMYPKL 



7caggcctgcacaggagcatggagtcc:tccaga.gccaatgagcctccccagtgccttccccagcagtactccc gtccccaccccacctgctccccctg 

Au-CCGGACGTGTCCTCGTACCrCAGGGAGGrc:ACGGTTAC TC GGAGGGGTCAC GGAAGGGGTCGTCATGAG3GCAGGGGrGGGGTGCACGAGGGoG^: 

-insert pLMl 



-ORF pLMl 



G I H Q S MEoLQ. M - PMSLPSAFPSST^VPTPPAPP 



C " GC TCCCACAGAAG~ AGAGACGGAACAGC TGAC ~ TGGAGTGGAAGCCCC AGAGC TGGGC AAC TGGaCaGTaA TC AGCGGGATC5GA AC AC TC T TC tC AA 
>:-!CGAGGGTGTCTTCrTCT:TGCCTTCTCGACT3 MACCTCACCT TCGGGGTCTC GACCCGTTGACCTGTCATTAGTCGCCCTAoCCTTG7GAGAAGGGTT 

-insert pLMl 



a p r 



ORF pLMl — — 

V S G S P R A G 0 L 0 3 N 0 ft D P N r L P > 






GAAAGGGC "CACGTACCAGCTTCAGTCCCAGGAGGAGACCAAGGAGAGCCGACATT CCCATACCA TTGG TGGGCTGCCTGAA TCCGA TGACC AG TCAG&" 
CrTTCCCGAGTCCATGGTCGAAGTCAGGGTCCTCCTCTGGrTCCTCTCCGCTGTAAGGGrArGGTAACCACCCGACGGACrTAGGCrAC FGGTCAG TCT2 



-ORF pLMl 



KGLRYOLQSOEETKERRHSHT IGG 



I P E S 0 0 Q S E 



CTGCCTTCTCCCCCTGCACTTCCCATGTCTCTGAGTGCAAAGGGCCAACTTACCAACATAGTGAGTC CCACTGCGGCCACCACGCCAAGAATCACCCGcr 
GACGGAAGAGGGGGACGTGAAGGGTACAGAGACTCACGTTTCCCGGTrGAATGGTTGTATCACTCAGGG TGACGCCGGTGGTGCGGTTCTTAGrGGaCt*ll 



-ORF pLMl 



LPSPPALPM 



LSAKGQLTN 



vsptaattprjtp 



CCAACAGCATCCCCACCCACGAGGCGGCCTTCGAGCrGTACAGCGGCTCCCAAATGGGGAGCACCCTGTCCC TGGCCGAGAGACCCAAGGGAATGATTCu 
GGTTG rCGTAGGGGTGGGTGCTCCGCCGGAAGCTCGACATGTCGCCGAGGGTTTACCCCTCG TGGGACAGGGACCGGCTC TCTGGGTTCCCTTACTAAGC 



jaoc 




-ORF pLMl 



5NS | P TH E AA F£ L YS G S QMG S T L S L A£RPKGM1R 

GTCAGGATCCTTCCGAGACCCCACGGACGATGTTCACGGCTCAGTGCTGTCCCTGGCCrcCAGTGCCTCCrCCACCTACTCCTC AGCTGAGGAGAGGATG 
CAGTCCTAQGAAGGCrCTGGGGTGCCTGCTACAAGTGCCGAGTCACGACAGGGACCGGAGGTCACGGAGGAGG TGGATGAGG AGTCGAC TCC TC TCCTAC 



•ORF pLMl 



3 G 



F R D p T D OVHGSVUSLASSASST YSSASERr 



CAArcrGAQCAAATCCGGAAGCTTCGTAGGGAACrGGAATCATCCCAGGAAAAAGTGGCCACCTTGACGTC TC AGC T 7 TC TGCC A ATGC T A A 7C "GG~GCi 
G-TAGAC FCC TT TAGGCCT rCGAAGCATCCCTTG ACC TfAG T AGGG TCCTTTTTCA CCGGTGGAACTGC AGAG TCGAAAGACGGTTACGaT TAG ACCACC 

•insert pLMl 



-ORF pLMl 



iS£ , Cla < LQRELE $SQE K V A T L T S Q L S A M A N L V 

C 'ZCTT1 TGAGC AG A3CC T 3G TQ AA TATGACA TCCCG 'C TGCGACACC TGGCAGAGACGGCCGAGGAGAAGGACAC TGAG C TGC TGGATTTGCG AC AA 
G AC G AAA AC TCG TC TCGGACCAC T Ta 7ACTGTACGGCG5A CGCTGTGGACCGTC TCTGCCGGCTCC TC T TCC TO TG AC TCGACG A CC TAAACGC TC TTTC- 

- insert pLM 1 



— — ORF pLMl - 

F E Q f • I * * [ S ^LRHLAETAEEKOTEL LD.? 






C^rAGACfTTcrGAAGAAAAAGAACTC TGAGGCCCAoGCAGTCATrCAGG3AGCCCrT4ArGCCTCACAAACCACACCCAAA3AACTTCG-3ArCAAGAuA 
0 TaTC TG^AAGACrTCTTTTTCTTGAGACTCCGGGTCCGTCAGTAAGrCCCTCGGGAATTACGGAGTCrTTGG rGfC GGT'TrCTTGAAGCCTAGTTCTCT' 

-insert pLM1



-ORF pLMl 



(OrLKKKNSE 



OAVrOGALNASETTPKEu^ 



CAAAACTCCTCAGATAGCATCTCAAGCCrCAACAGCATCACTAGCCATTCCAGCATCGG CAGCAGCAAGGATGCTGATGCGAAAAAGAAGAAAAAAAAGA 
GTTTTGAGGAGTCTATCGTAGAGTTCGGAGTTGTCGTAGTGATCGGTAAGGrCGTAGCCGTCGTCGTTCCTACGACTACGCrTTTTCTrCTTTTTTTTCT 



"insert pLM1 



-ORF pLMl 



CNSS0SCSSLNSITSHSSI6SSK0A0AXKK 



G T TGGGTCTA TGAGCTTCGAAGTTCCTTCAACAAAGCGTTCAGTATAAAAAAGGGGCCCAAG TCAGCTTCCTCATACTCGG A T AT AG AO GAG AT TGCTAC 
CaaCCCAGATaCTCGAAGCTTCAAGGAAGTTGTTTCGCAAGTCATATTTTTTCCCCGGGTTC AGTCGAAGGAG TATGAGCCTATA tctcctctaacgatg 



-ORF pLMl 



5 v V Y £ C*SSFN.<AFSIKKGPKSA5SYS0IE 



t I A T 



ACCCGAC TCTTCAGCCCCCTCATCCCCC AAAC TaC agcatggttccac acagactgcttcaccctccatcaag tcctccaccttgtcctccg tgggcact 

"G5GC rGAGAAGTCGGGGGAGTAGGGGG TTTGATQ TCGT ACCAACGTGTCTCTGACGAAGTGGGAGGTAGTTC AGGAGGTGGAACAGGAG3C ACCCGTGA 



•insert pLMl 



•ORF pLMl 



P0S3APSSPKL 



HGSTETASPSIKSSTLSSVGT 



Ga 7GfCACCGAGGGCCC TGC TCACCC AGCCCCCCACAC T 4GCC TGTTCCArGCAAATGAGGAGGAGGAGCC AGAGAA GAAGG A3G TATCG3 AGC "GCGCT 
C-ACA3TGGCTCCCGGGACGAGTGGGTCGGGGGGTGrgATCCGACAAGGTACGTTTACTCCTCCTCCTCGGTC fCT TC TTCC TCC ATAGCC TCG ACGCGA 



•insert pLMl 



-ORF pLMl 



vT "EG?AHPAPHTRLFHAN£cE£ 



K K £ V S E L R 



C "3 A3C TATGGGAGAAGGAAA TGAAGC TTaCaQaC a f3 3 3C T TGGAGGCCC TCAACTC TZ CC 3 A CCA AC TGGA TCaGC T TCG33AGACC AF3CACA ACA7 
G^:r:GATACCCTCrrCCTrTACrrCGAATG-C"'3'A33CGAACCTCCGGGAGTTGAGACoGGTGGTTGACCTAGTCGAAGC:CTCTGG'A:GrGTTGTA 



• insert pLM 1 



-ORF pLMl 



l v g < e m < l r : 



* L E 4LN5AH0L0 0L 



£ X '% H fl r 
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fig 34 ptM4 f 1 > 10070) Stte and Sequence „ . 

G»_*-0 7 7C 3AuG 7GGACC TCC 7GAAAGC AGAGAA 7G AC CGAC 7GAAGG 7 AGCCCCAGGCCCC 7CA 7C ACCC FCC AC 7CCACGGC AGG 7CCC 73GA 7C A7C7 

' 1 ' ' ' ' i 1 1 1 ■ — t- 

CuTCAACCTCCACCTGGACGACrTTCGTCTCTTACTGGCrGACTTCCATCGjGGTCCGGGGAGTAGTCCGAGGTGAGGTCCCGTCCAGCGACCTAGrAGA 



-insert pLMI 



-ORF pLMI 



QtEVDLLK A E N 0 R L K V AP GPS SG ST PGQVJqs-3 

GCATT ATCTTCCCCACGCCGCTCCCTAGGCCTGGCAC TCACCCATTCC TTCGGCCCCAGTCTTGCAGACACAGACCTGTCACCCATGGA75G CATC AGTA 
CGTAATAGAAGGGGTGCGGCGAGGGATCCGGACCGTGAGTGGGTAAGGAAGCCGGGGTCAGAACGTCTGTGTC TGGACAGTGGGTACC~ACCGTAG TCA7* 



■insert pLMI 



-ORF pLM1 



ALSSPRRSLGLALTHSFGPSLADTDLSPMDG I S 

CTrGTGGTCCAAAGGAGGAAGTGACCCTCCGGGTGGTGGTGAGGATGCCCCCGCAGCACATCATCAAAGGGGACTTGAAGCAGCAGGAATTCTTCCrGGG 

1 1 ' ' ■ — — ' • *- — ' — 1 1 1 1 1 1 ' ' 1 i 5C"C'.* 

GAACACCAGGTTTCCTCCTrCACTGGGAGGCCCACCACCACTCCTACGGGGGCGTCGTGTAG TAGTTTCCCCTGAACTTCGTCGTCCTTAAGAAGGACCC 



•insert pLM1 



-ORF pLM1 



TCGPKEEV TL«VVVRMPP0HI tKGDLKOQEFFLG 

CrGTAGCAAGGrCAGTGGAAAAGTTGAC TGGAAGATGCTGGATGAAGCTGTTTTCCAAGrGTTCAAGGACTATATTTCTAAAATGGACCCAGCC TC TACC 

... i <>i ;;,; 

GACATCGTTCCAGrCACCTTTTCAACTGACCTTCTACGACCTACTTCGACAAAAGGTTCACAAGTTCCrGATATAAAGATTTTACCTGGGTCGGAGATGG 



-insert pLMI 



•ORF pLM1 



CSKVSGKVOVKMLDEAVFQVFKOY13KMD3AST 

C 73CGAC TAAGCACTGAGTCCATCC ATGGCTAC AGCA 7C AGCCACG TGAAACGAG TG 77GGATGCAGACCCCCCCGAGATGCC TCCTTCC-GTCGAGGT j 

GA'CC FGATTCG TGAC 7CAGG7 AGG TACCGA^GTCGTAG TCGGTGCACTT TGC TCAC AACC T ACG7CTCGGGGGGC TCTACGGAGGAACG GCAGCTCCA'J 



-insert pLMI 



-ORF pLM1 



L 5 I = T E 3 t HG YS I SHVKRVLDAEPPEMPPCRRG 

TCAA T AACA 7 A7C AGTC 7CCC TC AAAGG TC TG~ AGG AG A AATGCGTCG AC AGCCTGGTGTTCGAGACGC TG ATCCCCAAGCCG ATGATGCAGCACT AC A' 

AG'7ArTG7ArAGTCAGAGGGACTTTCCAGAC7 7CCTCTTTACGCAGC TG TCGGACCACAAGCTCTGCG AC TAGGGGTTCGGC TAC T AC G TC G TG A TGT A 



• insert pLM 1 



■ ORFpLMl 

V'lfllSvSLKG*. <£XCV03LVFETLIPK*MK H Y 



«*3CC rCCTGCTGAAGC ACCGGCGCC 7CGTCC TZ 7C3GGCCCC AGCGGCACGGGC AAGACC TACC fGACCAATCGC TTCGCC3 AG TACC ^coTuGAG^G" 



T-rGGAGGACGACTTCGrGGCCGCGGAGCAGGAGAGCCCGGGGTCGCCGTGCCCGTTCTGGATGGACTGGrTAGCG AACCGGCrCATGGACCACCTCG'--^ 

-insert pLM1 



-ORF pLMl 



3 L L L K H R R L V L 5GPSGTGKTYL fNRLAEYLVER 



TCrGGCCGTGAGGTCACAGAGGGCATCGTCAGCACCTTCAACATGCA CCAGCAGTCTTGCAAGGATCTGCAAC TGTATCTTTCCAACCTAGCCAACCAGA 
AGACCGGCACTCCAGTGTCTCCCGTAGCAGTCGTGGAAGTTGTACGTGGTCGTCAGAACGTTCCTAGACGTTGACATAGAAAG GTrGGATCGGTTGGTC- 

- insert pLMl 



-ORF pLM1 



S G R E V TEG1 V S TF NW HQQSC K 0 I Q L Y L S N L A N 0 

T>GACCGGGAAA CAGGAATrGGGGATGTGCCCCTGGTGArTCTATTGGATGACCTGAGTGAAGCAGGCTCCATCAGTGAGTTGGTCAATGGGGCCCTCAr 

ATCrGGCCCTTTGTCCTTAACCCCrACACGGGGACCACTAAGATAACCTACTGGACTCACTTCGTCCGAGGTAGTCACTCAACCAGTTACCCCGGGAGTU " * 



-insert pLMl 



-ORF pLMl 



>Q R ETGI GDV PLVIL L0 DLSEAGSISELVNGALT 

C-cCAAGTATCATAAArGTCCCTATATTATAGGTACCACCAArCAGCCTGTAAAAATGACACCCAACCATGGCTTG CACTTGAGCTTCAGGATGTTGACC 
GA:GTrCATAGTATTTACAGGGATATAATATCCATGGTGGTTAGTCGGACArTTTTACTGTGGGTTGGTACCGAACGTGAACrCGAAGTCCTACAACTG3 



-insert pLMl 



-ORF pLMl 



< y H K C P Y i ig 



TMQPVKMTPNHGLHLSF 



,M L T 



-':rcCAACAACGrGGACCCAGCCAA7G3CTTc:'G3TTCGTTACCTGAGGAGGAAGCTG GTAGAGTCAGACAGCGACATCAATGCCAACAHGGAAGA» t : 
^~:a33TTGTTGCACCTCG5TCGGT7ACCGAAG3AC:aaGCAATGGAC rcCrCCTTCGACCArCTCAGTCTGTCGCTGTAGrrACGGTTOrTCCTTCTr: 



-insert pLM1 



-ORF pLMt 



I 5 >l " v * p A M G F L V R YLRRKLVESOSOINAHICEE 

'<jZ~ rCGGGTGC TCGAC 7C3G7AC3C AA3C TG73CA "C ATC 7CCACACC T7CCTTG AGAAGCAC AGCACC TC AGAC7TCC 73 ATCGGCCC 77GCT 7C T~ 
AC3AAGCCCACGAGC f 3ACCCA7GGG 77CGACA3C a 7a3 fAC AGGTGTGGAAGGAAC 7CT rCGTG7CGTGG AG TC TGAAGGAGTA3CCGG3AA33AAGAA 



- insert pLM 1 



~ — ORFpLMI 

R V L 0 * V 3 < L * H L H T F L E K H S 7 3 D F L 1 G :> : F F 



TC 7G TCG TGTCCCATTGGC ATTGAGGAC TTCCGGACC TGGT TCAT TGACC rGTGGAACAACTCTATCAr TCCC rATCTA CAuGAASGAGCCAASSATG^ 
AGACAGCACAGGGrAACCGrAACTCCTGAilGGCCTGGACCAAGTAACrGGACACCTTGrrGAGATAGrAAGGGArAGATGrccrTCCrCGGTrcCTACC: 



-OAF pLMi 



L S C , P IG tE CFRTW F [D UW NNSttPYLOEc 



A < 0 G 



ATAAAGG TCCATGGACAGAAAGC TGCTTGGGAGGACCCAGTGGAATGGGTCCGGGACACACTTCCC TGGCCATCAGCCC AACAAGACCAAfCAAAGCTG' 
rA TTTCC AGG TACC TGTCTTTCGACGAACCCTCC TGGGTCACCTTACCCAGGCCCTGTGTGAAGGGACCGG TAGTCGGGTTGTTC TGGTTAG TTTCGAlI ^ 




rKV HGQKAAV EDP VEv V R 0 TLPVPSAQQOQS 



K L 



ACCACCTGCCCCC 



CCACCCACCGTGGGCCCTCACAGCATTGCCTCACCTCCCGAGGATAGGACAGTCAAAGACAGCACCCCA AGTTCTCTGGACTCAGATCC 
TGGTGGACGGGGGTGGGTGGCACCCGGGAGTGTCG TAACGGAGTGG AGGCC TCCTATCCTGTCAGTTTCTGTCGTGGGGTTCAAGAGACCTGAG TC TAGS ^ 




Y H L P P P T . V G P S | * S P P E 0 R T V K Q S T P S S L 0 S D P 

TCTGATGGCCATGC TGC ^G AAA GTTCAAGAAGCTGCCAACTACATTGAGTCTCCAGATCGAGAAACCATCCTGGACCCCAACCTTCAGG CAACACTTTAA 
A'JACrACCGGTACGACGACTTTGAAGTrCTTCGACGGTrGATGTAACTCAGAGGTCTAGC TC rTTGGTAGGACCTGGGGTTGGAAG TCCGTTGTGAAATT 

- insert pLM 1 




-ORF pLM1 



LMAHLLKLQEAAN VtE SPDRETlL 0 P N L Q A T L 

GGG 7 TCGGCAATCAC TG7C ACCCCCGGACACCAGAACGI TGGCATCAGCTATCTTAGCTCCTCC TCTCCCC TC TCC TCTTT CAGAGCAC " 5GCTCTCC A'J 
CCZ AAGCCG TTAG TGAC AG TGGGGGCC T3TCG TC T T3CG ACCGTAG TCGA TAGAATCGAGGAGGAGAGGGG AGAGGAGAAAG TC TCGTG ACCGAGAGG TZ 



-insert pLMi 



^^GNHCHPR T a£3vhQIS • I L L S P L l F Q 3 T G $ F 

C''CCAGGAGGAGAAC AGGA3GGAG G AGG AGA"GAAaG A3 jAGGGAC AGGT TCT TGGTGCTGT ACCT f TGAGAAC T TCC TACGAAGGAATGG TGGG G TGGC 

gouo rcc rcc rc ttg tcc tccc tcc rcc tc tac ccctccctg tccaagaaccacgaca tggaaac tc ttgaaggatcc ttccttaccacccc accj 



3 G G E 0 



-insert pLMi 



E G 



G 0 



5GTG3VCCTFENFLGRN 



G-TrGGGAACT rGTGCCCC:TAAACACArTTAC:G3::r:cr:rAATGACrTTGGGGAAAAGATGArTCTGGGTCrT-CCr TGACTTC-rGr rrcAAr - 
'-"A^ACCC f rCAACACGGGGGA T 7 T3 TG TAAA7GAC I3C~3GA3A T7AC TG AAACCCC 7 7TTC TAC TAACACCC AGAAAGGGAAC rpAAQ A AC AA AC TTAJ 



r. ii L C P L 



-insert pLMi — 

L. L V G K 0 0 S G 



3 I 



AC AAAC7CC7GGGC 777C7GGGGAGGGG 77CAGAAAACA TC AAA AC AC 7GCAGCAGTT CCCCGGAA77CAGC T fGGAC T7AACCA3GCTGAAC77GC fCA 
rGTrTGAGGACCCGAAAGACCCCTCCCCAAGTCTTTTGTAGTTTTGTGACGTCGTCAAGGGGCCTTAAGTCGAACCrGAATrGGTCCGACTTGAACGAGT ^ 



-insert pLMi 



T N S V A F V G G V Q K T S K H C SSSPEFSLDL TR L NLL 
AAAGAAGCCGAATTCCAGCACACTGGCGGCCGTTACTAGTTCTAGATAACTGATCATAATCAGCCATACCACArTT 

tttcttcggcttaaggtcgtgtgaccgccggcaatgatcaagatctattgactagtattagtcggtatggtgtaaacatc TccAAAATGAACGAAArTT" SaC * 

> , 

insert pLMi ' 
xftSRIPAHVRPLLVLDN.S 



S A t P h L 



R F V L L 



AACC tcccacacctccccc tgaacctgaaacataaaatgaatgcaattgttgttgttaacttgtttattgcagctt ataatggtt ACAAATAAAGCAATA 

TTGGAGGGTGTGGAGGGGGACTTGGACTTTGTATTTTACTTACGTTAACAACAACAATTGAACAAATAACGTCGAATATTACCAArGTTTATTTCGTTAr 
T S H 7 S P . T -NIK . M Q L L L L T C L L Q L I K V T N K A I 

GCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAAT GTArCTTAACGCGTAAATTGTAAGCGTTA 
CGTAGTGTTTAAAGTGTTTATTTCGTAAAAAAAGTGACGTAAGATCAACACCAAACAGGrTTGAGTAGTTACATAGAATTCCGCATTTAACATTCGCAAT ^ 



ASOISOIICHFFHC 



LVVVCPNSSMYLNA 



rw — = 

l V s V 



ATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGG CAAAATCCCTTATAAATCAAAAGAATAGAC 
rATAAAACAATTTTAAGCGCAArTTAAAAACAATrrAGTCGAGTAAAAAATTGGTTATCCGGCTTTAGCCGTTTTAGGGAATATTTAGTTTTCTTATCTG 



no.: 




CGAGArAGGGTTGAGTGTTGrTCCAGTrTGGAArAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAA GGGCGAAAAACCGTCTATCAGGGCGAT 
GCTC 7A7CCCAAC7CACAACAAGG7CAAACC 773T7C7C AGG TGATAATTTCTTGCACCTGAGGTTGCAGTTTCCCGC TTTTTGGCAGATAG TCCCGCTA 



rrton 



E IGLSVVPVUNKSPLLKNVOSNV 



K G R K T VYOGO 



GG'JCCAC 7ACG7GAACC A7CACCC 7 AA7CAAG7T7 77 7GGGG 7CGAGG7GCCG7AAAGCAC7AAA7CGGAACCC 7AAAGGCAGCC CCCC- A7 7 TAuAGC TT 
CCGGGrGATGCACrrGGTAGTGGGArrAGTTCAAAAAACCCCAGCTCCACGGCATTTCGTGArTTAGCCTTGGGATTrcCCrCGGGGGCTAAArCTCGAA 



L R E P 5 P 



S 3 F 



GSRCRK4LNRNPXGSPPFRA 



GA:CG3GAAAGCCGGCGAA:G7GGCGA3AAA.::AAG:3AAGAAAGCGAAAGGAGCGGGC3CTAGGGCGCTGGCAAGT«:TAGCGGr:A.:CCTGCGCGTAAC 



C-SCCCCrTTCGGCCGCTTGCACCGCrCTTTCCTCCrTTCTrTCGCTrTCCTCGCCCGCGATCCCGCGACCGTTOACATCGCCASTGCGACGCGCArTG 



5 G K P A « V A R < £ 



KKAKGAGARALASVAVTLRVT 



CA:CACA:CCGCCGCGC77AA7GCGCC3C7ACA:«:GCGCG7CAGG7GGCAC77TTCGGGGA AA7G7GCGCGGAACCCCTA77 7Gr7'.i'-7 77C7AAATM 
G:CGrGTCGGCGGCGCGAAr7A:GCGC:GA737:cc3;3CAGrCCACCG7GAAAAGCCC: 77 rACACGCGCCr7G0GGA7AAACA2Ar^AAAGAT 77.17 ' 



WO 98/24810 154 PCT/EP97/06956 



Tuesday, 18 November 1997 11:48


CAT rCAAAfATGTATCCCC TCATGAG AC AATAACCCTGA rAAATGC TrCAATAATATTGAAAAAGGAAGAGTCCTGAGGCGGAAAGAACCAGCrGTGGAA 
GrAAGTTrATACArAGGCGAGTACTCTGrTATTGGGACTATTrACGAAGTTATTATAACrTTrTCCTTCTCAGGACTCCGCCrTTCTTGGrCGACACCTT ^ 
HSN MY PLMRQ . P. .MLQ . Y . K R K S P E A E R T $ C G 

rGrGrGTCAGTrAGGGrGrGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGrATGCAAAGCArGCArCTCAATTAG TCAGCAACCAGGTGrGGAAAGTC: 
ACACACAGrCAATCCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCrTCATACGrTTCGTACGrAGAGTTAATCAGTCGTTGGTCCACACCTTTCAG3 ^ 
HCVS . G V E S P Q A P Q Q A £ V C K A C ISISQQPGVESP 

CCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGC CCCTAACTCCGCCCATCCCGCCCCTAACTCCGC 
GGTCCGAGGGGTCG TCCGTCTTCATACGTTTCGTACGTAGAGTTAATCAGTCGTTGGTATCAGGGCGGGGATTGAGGCGGGTAGGGCGGGGATTGAGGC3 
O AP QQA EVCKAC I S rS QQ P.SRP.LRPSRP.LP. 

CCAGTrcCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGG CCTCTGAGCTATTCCAGAAG TAGTGAGG 
GGTCAAGGCGGGTAAGAGGCGGGGTACCGACTGATTAAAAAAAATAAATACGTCTCCGGCTCCGGCGGAGCCGGAGACTCGATAAGGTCTTCATCACTCC 
PVP P I LRPMA O . FFIFHORPRPP RPLSYSRSSE 

AGGCrTTTTTGGAGGCCTAGGCTTTTGCAAAGATCGATC AAGAGACAGGATGAGGATCGTTTCGCATGATTGAACAAGATGGATTGCACGCAGGTTCTCC 
""""* *"' ( "' " " ' ' " 1 ' 1 ' " ' < ■ '- ■■ ■ i i - 1 1 1 iii i i i ii i ■ . i ■ | , ,., , , J g/>v 

TCCGAAAAAACCTCCGGATCCGAAAACGTTTCTAGCTAGTTCTCTGTCCTACTCCTAGCAAAGCGTACTAACTTGTTCTACCTAACGTGCGTCCAAGAG2 



EAFLEA. AFA KIQQE TG .G SFRM I EGDGLHAGSF 
GGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCA GCGCAGGGGCGCCCG 

ccggcgaacccacctctccgataagccgatactgacccgtgttgtctgttagccgacgagactacggcggcacaaggccgacagtcgcgtccccgcggg: 



3 1 OC 



A A V V E R L F G Y 0 V A Q Q T I G CSOAAVFRLSAQGRP 

G"7CTTT TTGTCAAGACCGACCrGTCCGGTGCCCrGAAT GAACTGCAAGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAG 
CAAGAAAAACAGTTCTGGCTGGACAGGCCACGGGACTTACTTGACGTTCTGCTCCGTCGCGCCGATAGCACCGACCGGTGCTGCCCGCAAGGAACGCGTC 



*' L 7 v * T 0 L S G A L N E L ODEAARLSVLATTGVPCA 

C "3TGC TCGACG TTGTC AC fGAAGCuGGAAGGGACTGuC TGC TATTGGGCGAAGTGCCGGGGC AGGATCTCCTGTCATCTCACC T TGC TCC TGCCGAGAA 
GArACGAGCTGCAACAGTGACTTCGCCCTTCCCTGACCGACGATAACCCGCrTCACGGCCCCGTCCTAGAGGACAGTAGAGrGGAACGAGGACGGCrCT: 



-Kan-Neo 



A v L ? v v T £ * G R 5 V t L L G £ VPGQDUL S3HLAPA£> 

AG 7 a rcC A ATGGC TGATGCAA TGZG3CGGC T2C AT ACGC T TGATCCGGC TACC TGCCC AT TCGACCACC AAGCGAAAC ATCGC A TCG A5CGAGC ACG" 
"CA 7AGG TAGTACCGAC TACGTTaCGCCGCCGaCG ?a TGCGA ACTAGGCCGATGGACGGG TAAGCTGGTGG TTCGC TTTGTAGCG TAGC rCGCTCGTGC a 



s-c-: 



7 5 A O A M PRLHTLOPATCPFOHOAKHR I £ R A F 

AC 'CGGA rGGAAGCCGGTCTTGTCGATCAGGAT^ATCTGGACGAACAGCATCAGGGGCTCGCGCCAGCCGAAC rGTTCGCCAGGC TCAAGGCGAGC ATG: 



'PAGCC TaCC T TCGGCC AGAACAGC " AG TCC7AC 'A3 ACC TGC r TC TCG T AG FCCCC GaGCGCGG TCGGC T TG AC AAGCGGTCCG AG TTCC GC TCG TACU 


3 M E A G l V C 0 0 . 0 E • H 0 G L A P A £ l F A R U ic a 3 f 



35:-: 




CCGACGGCGAGGATCTCCTCGTGACCCATGCC GATGCCTGCT TGCCGAATATCATGG TGGAAAATGGCCGC TTTTCTGGA t TCaTCGAC "G TCGCCGGC" 
GGC fGCCGCTCC TAGAGCAGCAC TGGGTACCGCTACGGACGAACGGCTTA TAGTACCACCTTTTACCGGCGAAAAGACC TAAGTAGC TGAC ACCGGCCGA 



POGeOLVVTMGOACLPNtMVENGRFSGFiOCGRL 

GGG TG TGGCG GACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCrGACCGCTT C C TCG TGC TT TACGG" 
CCCACACCGCCTGGCGATAGTCCTGTATCGCAACCGATGGGCACTATAACGACTTCTCGAACCGCCGCTTACCCGACTGGCGAAGGAGCACGAAATGCCA 3 ^' 



GVAORYQO IALATRO tAEELGGEVAORFLVLYG 
ATCGCCGCTCCCGATTCGC AGCGCATCGCC TTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGAC TC TGGGGTTCGAAATGACCGACCAAGCGACGCC 


TAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAAC TGCTCAAGAAGACTCGCCCTGAGACCCCAAGCTTTAC TGGCTGGTTCGCTGCG3 



3 



iaaposqr iafyrlldeff aglwgsk . p t k r r 

caacctgccatcacgagatttcgattccaccgccgccttctatgaaaggttgggcttcggaatcgttttccgggacgccggctggatgatcctccagcg: 


gttggacggtagtgctctaaagctaaggtggcggcggaagatactttccaacccgaagccttagcaaaaggccctgcggccgacctactaggaggtcgcu 

ptchheisippppsmkgvasesfsgtpag.sssa 
ggggatctcatgctggagttcttcgcccaccctagggggaggctaactgaaacacggaaggagacaataccggaaggaacccgcgctatgacggcaatam 


cccctagagtacgacctcaagaagcgggtgggatccccctccgattgactttgtgccttcctctgttatggccttccttgggcgcgatactgccgttatt 

GISCVSSSPTLGGG. LKHGRROYRKEPAL.RQ 

aaagacagaataaaacgcacggtgttgggtcgtttgttcataaacgcggggttcggtcccagggctggcactctgtcgataccccaccgagaccccatts 

rTTCrGTCTTATTTTGCGTGCCACAACCCAGCAAACAAGTATTTGCGCCCCAAGCCAGGGTCCCGACCGTGAGACAGCTATGGGGTGGCTCTGGGGTAAC 
:< 0 R I KRTVLGRLF INAGFGPRAGTL5 IPHROP I 

GGGCCAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGGGGCGGCAGGCCCTGC: 


CCCGGTTATGCGGGCGCaAAGAAGGAAAAGGGGTGGGGTGGGGGGTTC AAGCCCACTTCCGGGTCCCGAGCGTCGGTTGC AGCCCCGCCGTCCGGGACGl: 

GANTPAFLPF» -PTP0VRVKA0G5QPTSGRQALF 
A TAGCCTCAGG TTACTCATATA7 ACTTT AGATTjATT TAAAACT TCATITTTAATTT AAAAGGATC TAGGTGAAGATCCTTTT73 ATAA7C TCATGACC* 


ta7cggagtccaatgagtatatatgaaatc taac taaat tt tgaagtaaaaattaaattttcctaga tccact tctaggaaaaac tatt agagtac tgg- 

PQVTHI / F R L I N F I F N L K G S R R S F L I IS.F 

AiArCCCTTAACGTGAGTTrrCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGA TC TTC TTGAGA TCCTTTTTTTCTSCGCGTAATCTGCTJ 


rrrAGGGAATrGCACTCAAAAGCAA2GTGAC:CGCA2T:TGGGGCATCTTTrCTAGTTTCCTAGAAGAACTCTAGGAAAAAAAGA:GCGCATTAGACGA: 



< S L N VSFR3TER0TP.KRSK0LLE1LFFCA.SA 

c-gcaaacaaaaaaaccaccgctaccagcggt^gtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgca: 


GAACG rTrGTTFTTT 1" GG TGGCGA7GGTCGCCACC AAAC AAACGGCCT AG T TCTCGATGG T TGAGAAAAAGGC TTCCATTGACC3AAGTC37C TCGCG TC 



A C K 0 < N H R Y C ? V F 7 C R IKSYOLFFRP. L A 3 A E R F 




ArACCAAATACTGrCCTTCTAGTGrAGCCGTAGTTAGGCCACCACrTCAAGAACTCTGTAGCA CCGCCTACAr^CC TCGC TC TGC FAATCC TGTTACCA3 
TATGGTTTATCACAGGAAGATCACATCGGCATCAA TCCGGTCGTGAAGTTCTTCAGaCaTCGTGGCGGaTG TA TGGAGCGAGACGATTAGGACAATGG T-** ^ 



rOlUSF.CSRS.ATTSRTL.HRLHTSLC.S 



C Y 0 



TGGCTGC TGCCAGTGGCGATAAGTCGTG TCTTACCGGGTTGGAC TCAAGACGATAGTTACCGGATAAGGCGC AGCGGTCGGGC TG AACGGGGGG TTCG T3 
ACCGACGACGGTCACCGCTATTCAGCAC AGAATGGCCCAACCTGAGTTCTGCTATCAATGGCCTATTCCGCGTCGCCAG CCCGACTTGCCCCCCAAGCAC 97 ° 

pOCoti 

V L L P V A 1 S R V L PGVTQDOSYR IRRSGRAERGVR 



CACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGArACCTACAGCGTGAGCrATGAGAAAGCGCC ACGCTTCCCGAAGGGAGAAAGGCGGAC 
GTGTG TCGGGTCGAACCTCGCTTGCTGGATGTGGCTTGACTCTATGGATGTCGCACTCGATACTCTT TCGC GG TGC GAAGGGCTTCCCTCTTTCCGCC T2 ^ 



-pUCon ■ 



A H S P A V S E R P T P N . 0 T Y S V S YEKAPRFPKGERRT 

AGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACG AGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAG TCCTG TCGGGTTTCGCCACC 
TCCATAGGCCATTCGCCGTCCCAGCCTTGTCCTCTCGCGTGCTCCCTCGAAGGTCCCCCTTTGCGGACCATAGAAATATCAGGACAGCCCAAAGCGGTGu * 



IpUCwi 



C !R AAGSEQE SARGSFQGETPG IF IV 



L S G F A T 



TCTGACTTGAGCGTCGAHTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCC TTTTTACGGTTCCTGGCCTTTTGCT^ 

agactgaactcgcacctaaaaacactacgagcagtccccccgcctcggatacctttttgcggtcgttgcgccggaaaaatgccaaggaccggaaaacga: ! °' Jl 



11 pUCOfl - ' f 

S 0 L S V 0 F C 0 A R Q G G GAYGKTPATRPF Y G S V P F 

GCCTTfTGCTCACATGTTCTTTCCTGCGTTATCCC CTGATTCTGTGGATAACCGTATTACCGCCATGCAT 
CGGAAAACGAGTGTACAAGAAAGGACGCAATAGGGGACTAAGACACCTATTGGCATAATGGCGGTACGTA ] ° 0>J 
GLLLTCSFLRYPl ILVlTVLPPCI 






Tuesday, 18 November 1997 10:34
fig 30 pEGFP72 (1 > 9697) Site and Sequence
Enzymes: 72 of 146 enzymes (Filtered)
Settings: Linear, Certain Sites Only, Standard Genetic Code



TAGTTATr*ATAGTA&TCAyTACGCGGTCATTAGTTC^^ 

ATCAATAATTATCATTASTrAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGCCATTTACCGGGCGGACCGACTG'j: 
. Ll lVtNV GV ISS . P IYG V P R Y I T T C K V p a V L T 

A** H Aat II 

CCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCC CATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGG7 
GGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCATTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTCATAAATGCCA 
AQ RPP PI DVN NOVCS HSNANROFPLTSMGGVFTV 

Bgl I Nde I Aat II Bgl I

AAACTGCCCACTTGG CAGT ACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCC TGGCATTATGCCCAGTA 
" "*" 1 ' 1 11 ' " " ' ■■■■»■.. i i ■ i ■ ■ i i i 1 — . . ! . , l 

TTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTTCATGCGGGGGATAACTGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCAr 
N C P L C S T S S V S V A K Y A P Y R Q R MARLALCPV 

SnaB I Nco I

CArGACCTTATGGGACTTTC CTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGrTTTGGCAGTACATCAATGGGCGTGGA 


GrACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCCAAAACCGTCATGTAGTTACCCGCACCT " 
HQL NGLSY LA VHL R f S H R YYHGD A V L A V H Q V A V 

Aat II

TA3CGGTTTGAC TCACG3G3A TTTCCAA3TC7CC ACCCCATTGACGTC AA TGGGAGTTTGTT TTGGCACCAAAATCAACGGGACTTTCC AAAATGTCGTm 


A'CGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAG TTACCCTCAAACAAAACCGTGGTTTTAGTTGCCC rGAAAGGTTTTACAGCAT 

! AV : I Ts ' SX SPPH . RQVEFVLAPKSTGL5ICH3 . 

Nhe I £c47 

i 1 

m'-AAC TCCGCCCCATTGaC GC AAATGGGCGGTaGGCGTGTAC GGTGGGAGGTCTATATAAGC AGAGCTGGTTTAGTGAACCGTCAGATCC3CTAGCGCTA 
TGTTGAGGCGGGGrAAC'GCGTTrACCCGCCATCCGCAC ATGCCACCC TCCAGATATATTCGTC TCGACCAAATC ACTTGGC A3TCTAGGCGATCGCGA? 
0 I R P tD AN GR . A CTVGGLYK03VFS£PSD ?LAL 

Nco I

CC GC rCCCC ACCATG3T3z3CAAG3GC3AGGAG:"GrTCACCGGGG TGGTGCCCATCCTGG TCGAGC T GGACGGCGACGTAAACGGCCACAAGT TC AGi*5 
G0:CAGCGGTGGTACCA:7:GTrCCCGCTCC'CGACAA3rGGCCCCACCACGGGTAGGACCAGCTCGACCrGCCGCTGCA7rTGCCGGTGrrCAAGrCG: 



-eGFP.C.e.unc53



= VA r "v3KQE ELr TG VVP 1 LVE10GDVNGHKF3 

'G'CCGGCGAGGGCGAGwG'GaTGCC ACCTACGGC AAGC TGACCCTGAAG T TCATCTGCACCACCGGCAAGCTGCCC G TGCCC TGGCCC ACCC TCGTGAC 
^CAGGCCGCTCCCGCrCCC^CTACGGTGGATGCC^rCGACTGGGACrTCAAGTAGACGTGGTGGCCGTTCGACGGGCACGGuAICGGGTaGGAGCACTJ ^ 



' 'e3FP.C.e.unc53 ■ 

SCEGgSDAT > Y::<LT LKFlCTTGKLPVPVPrLVT 






CACCC TGACC TACGGCG 7GCAGTGC TTCAGCCGC 7ACCCCGACCACATGAAGCAGCACGAC T TC T7CAAGTCC 3CCA 7CCCC GAAGGC7ACG TCCAGGA3 
GrGGGACTGGArGCCGCACGTCACGAAGTCGGCGATGGGGCTGGrGTACTrCGTCGTGCTGAAGAAGTTCAGGCGGTACGGGCTTCCGATSCAGGTCCr:- 



-©GFP.C.e.unc53 



TLrrGVQCP5RYP0HNK0H0FFXSAHPeGVV0E 

J<spl 

CCCACCATCTTCTTCAAGGACGACGGCAAC TACAAGACCCGCGCCGAGGTG AAGTTCGAGGGCGACACCCTGG TG AACCGCATCGAGCTGAAGGGC ATCG 
GCGTGGTAGAAGAAGTrCCrGCTGCCGTTGATGTTCTGGGCGCGGCTCCACTTCAAGCTCCCGCTGTGGGACC AC T TGGC GTAGC TCGACTTCCCG TAG C ' ^ 



-eGFP.C.e.unc53 



R r IFFKDOGNYK T R A E VKFEGDTL V N R I E L K G I 

ACTTCAAGGAGGACGGCAACATCCTGGGGCACAACCrGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAA 


TGAAGTTCCTCCTGCCGTTGTAGGACCCCGTGTTCGACCTCArGTTGA T6TTGTCGG TGTTGCAGATATAG TACCGGC TGTTCGTCTTCTTGCCGTAGTT 



-eGFP.C.e.unc53 



OFKEOGNILGHKLEVNYNSHNVY IMADKQKNG IK 

GGrGAACTTCAAGArCCGCCACAACATCGAGGACGGC AGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTG 


CCACTTGAAGTTCrAGGCGGTGTTGTAGCTCCTGCCGTCGCACGTCGAGCGGCTGGTGATGGTCGTCTTGTGGGGGTAGCCGCTGCCGGGGCACGACGAC 



-eGFP.C.e.unc53 



V N F K I R HN I EOGSVQUAOHYQQN TP IGDGPVLL 

CCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGA3CAAAGACCCCAACG AGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACC3CC3CCGGGA 
GGGCTGTTGGTGATGGACTCGTGGGTCAGGCGGGACTCGTTTCTGGGGTTGCTCTTCGCGCTAGTGTACCAGGACGACCTCAAGC ACTGGCGGCGGCCCT 



-eGFP.C.e.unc53 



3 DN HYLSrQS ALS K DPNEKROHMVULEFVTAAG 

Asu II

BspM II Bgl II EcoN I

I * i 

rCACTCTCGGCATGGACGAGCTGTACAAGTCCGGACrCAGATCTACGTCAAATGTAGAArrGATACCAATC TACACGGATTGGGCCAATCGGCACCTTT: 
AGrGAGAGCCGTACCTGCTCGACATGTTCAGGCCrGAGTCTAGATGCAGTTrACATCTTAACTATGGTTAGATGrGCCTAACCCGGTTAGCCGrGGAAAc 



-eGFP.C.e.unc53 



-C.e. unc53



: r I G >\ 0 £ <_ y ■< S G L R S T S N VELCPIYTDVAN3HL* 

Nru I EcoR I 

I i 
'.i^AGGGC AGC T T aTC AAAG TCP AT TaGGGATaTTTCC AA TG A ff T TCGCG AC TA TCG AC TGG TT TC 7C AGC 7TA77AAT373A7CGTTCC3A7C AACGAA 

C " 'CCCGTCGAA7AGT T7C AGC 7aa7C:CTa7AAAGG" TACT AAAAGCGC 7GA TAGC 7GACC AAAGAG7CG AA TA A T 7AC AC fAGCAAGGC 7A3T7GC T : 



-eGFP.C.e.unc53



-C.e. unc53

» g > «- -5 * ■ - 0 : ■ :i D F R 0 f R I . V i ) U I ;i / I v :* | n F. 




Bsm I 

T 7C TCGCCTGCATTCACGAAACG TTTGGCAAAAATCACATCGAACC TGGATGGCCTCGAAACGTGTC TCGACTACCTGAAAAAT C TGGG TCTCGAC TGC T 
AAGAGCGGACGTAAGTGCTTTGCAAACCGTTTTTAGTGTAGCTTGGACCTACCGGAGCTTTGCACAGAGCTGArGGACTTTTTAGACCCAGAGCTGACGA ICC, " 



-eGFP.C.e.unc53



-C.e. unc53



rSPAFTKftLAX: [ TSNLOGLE TCLOYlKNLGLDC 

Ear I 

EcoR V Pvu II Ksp632 I Hind III

CGAAACTCACCAAAACCGATATCGACAGCGGAAACTTGGGTGCAGTTCTCCAGCTGCTCTTCCTGCTCTCCACCTACAA GCAGAAGCTTCGGCAAC TGAA 
GCTTTGAGTGGTTTTGGCTATAGCTGTCGCCTTTGAACCCACGTCAAGAGGTCGACGAGAAGGACGAGAGGTGGATGTTCGTCTTCGAAGCCGTTGACTT ' 



-eGFP.C.e.unc53 



-C.e. unc53 



SKLTKTOIOSGMtGAVLOLLFLLSTYKOlCLROLi: 

PmaCI 

Sst II PmaCI

AAAAGATCAGAAGAAATTCGAGCAACTACCCACATCCATrATGCCACCCGCGGTTTCTAAATrACCCTCGCCACGTGTCGCCACGrCAGCAACCGCTTCA 


TTTTC TAGTCTTCTTTAACCTCGTTGATGGGTGTAGGTAATACGGTGGGCGCCAAAGATTTAATGGGAGCGGTGCACAGCGGTGCAGTCGTTGGCGAAGT 



•eGFP.C.e.unc53 



-C.e. unc53 



K 0 Q * K L E Q LPTSIMPPAVSKLPSPRVATSATA3 

GCAAC TAACCCAAATTCCAACTT TCC AC AAAtGTC AACA TCC AGGC TTCAGACTCCACAG tc AAGAATATCGAAAATTGATTCATCAAAGATTGGTATCA 
CGTTGATTGGGTTTAAGGTTGAAAGGTGTTTACAGTTGTAGGrcCGAAGTCTGAGGTGTCAGTTCTTATAGCTTTTAACTAAGTAGTTTCTAACCATAGT 



-eGFP.C e.unc53 



-C.e. unc53 



A rN P,\SNFPQ M3T S RLQTPQSRISKI0SSKIG1 
Aat II 

A«j-C AAAGACG TC TGGACT TAAACC ACCC TCATC A TC 4ACC ACT TC A TC A A AT AA TACAAAT TCAT TCCGTCCG7CGAGCC3 TrCGAGTGGCAATAATAA 



:3GTTTCTGCAGACCTGAATTTGGTG3GAGTA3'A3 7TGGTGAAGTAGTTTATTATGTTTAAGTAAGGCAGGCAGCTCGGCAAGCTCACCGTTATTATT 



- eGFP.C. e.unc53 



" ~ — - — C.e. unc53 

)<:>KT SGL<?3SSS TTSSNNTMSFRP SSSSSGMNN 




Ear I 

EcoR V Ksp632l Asu II 



I 



j 

TGTTGGCTCGACGATATCCACATCTGCGAAGAGCTTAGAATCATCATCAACGTACACCTCTArTTCGAATC TAAACCGACC T ACCTCCCAAC TCCAAAA* 
ACAACCGAGCTGCTATAGGTGTAGACGC TTCTCGAATCTTAGTAGTAGTTGCATGTCGAGATAAAGCTTAGATrTGGCTGGATGGAGGGTTGAGGTTTTT 



-eGFP.C.e.unc53 



-C.e. unc53 



VGST ISTSAKSLESSSTYSSISNLNRPTSQLQK 

Xba I Nhe I 

I I 
CCTTC TAGACCACAAACCC AGC TAGTTC5TGTT3C TACA ACTAC AAAAATCGGAA GCTCAAAGC TAGCCGC TCCGAAAGCCG TGAGCACCCCAAAACTTG 

GGAAGATCTGGrGTTTGGGTCGATCAAGCACAACGATGrTGATGTTTTTAGCCTTCGAGTTTCGATCGGCGAGGCTTTCGGCACTCGTGGGGTTTTGAAC ~~ 



-eGFP.C.e.unc53 



-C.e. unc53 



PSSPOTOLVRVATTTK I GSSXLAAPKAVSTPKL 

Bsm I 

CrTCrGTGAAGACTATrGGAGCAAAACAAGAGCCCGATAACAGCGGTGGTGGTGGTGGTGGAATGCrGAAATTAAAGTTATTCAG TAGC AAAAACCCATC 


GAAGACACTTCTGATAACCTCGTTTTGTTCTCGGGCTATTGTCGCCACCACCACCACCACCTrACGACTTTAA TTTCAATAAGTC ATCG TTTTTGGG7AG 



-eGFP.C.e unc53 



-C.e. unc53 



A S V K T 1 6AKQ£PDNSGCGGGCMLKUKLrSSKNP3 
T~ZC TCATCGAATAGCCCACAACC T ACGAGAAAGCCGGCGGCGG TGCC TC AACAACAAACTT TG TCGAAAATCGCTGCCCCAG TG AAAAGTGGCCT jAAo 


AAGGA3TAGCTTATCGGGTGT TGGATGC rCTTTCCGCCGCCGCC ACGG AGTTGTTGTTTGAAACAGCTTTTAGCGACGGGGrC AC TTTTCACCGGACTTC 



- eGFP.C.e. unc53 



•C.e. unc53 



5S5NSP0PTRKAAAVPQ00Tl.SK I AAPVKSGLI 
BstX I Hind III

i ! 

ccgccgaccagtaagc tgggaagtgccacgtc ta "*g tcg aagc t tigf acgcc aaaag "f tcc taccgtaaaacggacgcccc aa rcATAr:rcAACAAu 

G'JCGGC FGG TCAT TCGACCC T TCACGG TGCAGA 7ACA3C TTCGAAACA 7GCGG 77TTCAAAGGA7GGCA 7 7TTGCC TGCGGG3 TT AGTA"43AG TTG'Tw 



-eGFP.C e.unc53 



— -C.e. unc53 — — — — — —— — — — — _ 

:53 r3KLGSATS. M SKLC rPlCVSYRKTOA? I 30 0 




Ear I 

Ksp632 I BspM II

ACTCGAAACGATCCTCAAAGAGCAGTGAAGAAGAGTCCGGATACGCTGGATrCAACAGCACGTCGCCAACGTCATCATCGACGGAAGGTTCCCTAAGCA7 


rGAGCTTTGCTACGAGTTTCTCGTCACTTCTTCKAGGCCTATGCGACCTAAGTTGTCGrGCAGCGGTTGCAG TAGTAGC TGCCTTCCAAGGGATTCGTh 



-eGFP.C.e.unc53 



-C.e. unc53 



0 S K R C S K SSE cESGY AG FNS TS PTSSS TE G S L S M 

Bsm I 
Sph I 
Ava III
Nsi I

; i 

GCATTCCACATCTTCCAAGAGTTCAACGTCAGACGAAAAGTCTCCGTCATCAGACGATCTTACTCTTAACGCCTCCATCGTGACAGCTA7CAGACAGCCG 


CGTAAGGTGTAGAAGGTTCTCAAGTTGCAGTCTGCTTTTCAGAGGCAGTAGTCTGCTAGAATGAGAATTGCGGAGGTAGCACTGTCGATAGTCTGTCGGC 



-eGFP.C.e.unc53 



■C.e. unc53 



HSTSSKSSTSOEKSPSSOOLTLNASIVTAIROP 

3spl 

ATAGCCGCAACACCGGTTTCTCCAAATATTATCAACAAGCCTGTTGAGGAAAAACCAACACrGGCAGTGAAAGGAGTGAAAAGCACAGCGAAAAAAGATC 


r^TCGGCGTTGTGGCCAAAGAGGTTTATAATAGTTGTTCGGACAACTCCTTTTTGGTTGTGACCGTCACTTTCCTCACTTTTCGTGTCGCTTTTTTCTAG 



- eGF P. C.e.unc53 



-C.e. unc53 



:aatpvspni i n k p v e ekptlavkgvkstakko 
PmaCI 

Pvu II ! PmaCI EcoR V 

III I 
CACCrCCAGCTGTTCCGCCACGTGACACCCAGCCAACAATCGGAGTTGTTAGTCCAATTATGGCACATAAGAAGTTGACAAArGACCCCGTGATATCTGK 

G "GGAGGTCGACAAGGCGG rGCACTG TGGGTCGG7TGT T AGCCTCAACAATCAGGTTAAT ACCGTG7ATTC TTCAACTGTTTAC TGGGGCAC TATAGAC7 



■eGFP.C.e.unc53 



-C.e. unc53 



?=>PAVP?aOTQ?T IGVV/SP IMAHKKL T N D P V ISE 

Alwn I 

I 

A AA ACCAGAACC TGAAAAGCTCw AA7CAATGAGCA7C 3 ACACGACGGACG TTCCACCGCT TCCACC "C 7AAAA 7CAG7TGT fCC&CT TAAA A TGAC FTC A 
""^GjTCrrGGACTTTTCGAGGTrAGTTACTCj • A3:T3TGCTGCCrGCAAGG7GGCGAAGGTGGAGATTTTAGTCAACAAGGTGAArTrrACTGAAG" 



-eGFP.C.e.unc53 



C e. unc53 

•*' p £ P E K. L 0 S M 5 ! 3 T 7 0 V P P L P P L K* 5 V V P L < M T 5 




Spl I

ir 

ATCCGACAACCACCAACGTACGATGTTCTTC TAAAACAAGGAAAAATCACATCGCCTGTCAAGTCGTTTGGATATGAG CAGrCGTCCGCGTCTGAAGACT 
rAGGCTGTTGGTGGTrGCArGCTACAAGAAGATTTTGTTCCTTTTTAGTGrAGCGGACAGTTCAGCAAACCTATACTCGTCAGCAGGCGCAGACrTCTGH ^'^ 



-eGFP.C.e.unc53



-C.e. unc53



I ROPPTYDVLLKGGK I TSPVKSFGYEQSSASEO 

CCATTGTGGCTCATGCGTCGGCrCAGGTGACTCCGCCGACAAAAACTTCTGGTAATCATTCGCTGGAGAGAAGGATGGGAAAGAA 7 A AG AC A TC AG AA TC 


GGTAACACCGAGTACGCAGCCGAGTCCACTGAGGCGGCTGTTTTTGAAGACCATTAGTAAGCGACCTCrCTTCCTACCCTTTCTTATTCTGTAGTCTTA^ 



-eGFP.C.e.unc53 



-C.e. unc53



SI VAHASAOVTPP TKTSGNHSLERRMGKNKTSEo 
CAGCGGCTACACCTCTGACGCCGGTGTTGCGATGTGCGCCAAAATGAGGGAGAAGCTGAAAGAATACGATGACATGACTCGTCGAGCACAGAACGGCTAT 


GrCGCCGATGTGGAGACTGCGGCCACAACGCTACACGCGGTTTTACTCCCrCTTCGACTTTCrTATGCTACTGTACTGAGCAGCTCGTGTCTTGCCGATH 



-eGFP.C.e.unc53 



-C.e. unc53 



S GY TS OAGVAMCAKMREKLICEYODMTRRAONGY 

Asu II Sst I BspM II

CC "GACAAC TTCGAAGAC AGTTCC TCC TTGTCGTC TGGAATATCCGAfAAC AACGAGC TCGACGAC ATATCCACGGACGATTTGTCCGGAGTAG ACATGU 
GGACTGTTGAAGCTTC TGTCAAGGAGGAACAGCAGACCTTATAGGCTATTGTTGCTCGAGCTGCTGTATAGGTGCCTGCTAAACAGGCCTCATCTG TACC 



3-;X 



-eGFP.C.e.unc53 



-C.e. unc53 



- 0 N F EC 3 S 5 t 5 5 G I SONNELDO ! S T 0 0 L 5 G V 0 M 

CAACAGTCGCCTCC AAACA TAGCGAC TA TTCCCAC TT 7G TTCGCCATCCCACGTCTTCTrCCTCAAAGCCCCGAGTCCCCAGrCGGTCC TCC AC ATCA«j' 
G' TQ TCAGCGGAGG TTTGT ATCGCTGATAAGGG^GAAAC AAGCGGTaGGG FGCAGAAGAAGGAGTTTCGGGGC TCAGGGGTC AGCCAGGAG3 TGTaGTCm 



-eGFP.C e.unc53 



* — — — C.e. unc53 — — ______ . 

^rVASKHSOYS '- .-VRHPTSSSSfCPRVPSRSSrS 




Bgl I 



rar I 
r 

SCGCCfr 

GCTAAGAGCTAGAGCTCGTCTTGTCCTCTTACACATGTTrGAAGACAGGGrCACGGCTTGCrCGGTTGCACCGCGGCGACGGTGuAGTTGGAAGCCTGT- 



CGATTCTCGATCrCGAGCAGAACAGGAGAATGTGTACAAACTTCTGTCCCAGTGCCGAACGAGCCAACGTGGCGCCGCTGCCACCrCAACCTTCGGACAA 



-eGFP.C.e.unc53 



-C.e. unc53 



0SRSRAEOENVYKLLSQCRTS0RGAAATSTFGO 

r»a I Spe I 

Sma I Pvu II Sal I

CATTCGCTAAGATCCCCGGGATACTCATCCTATTCTCCACACTTATCAGTGTCAGCTGATAAGGACACAATGTCTATGCACrCACAGACTAG TCGACGAC 

GTAAGCGATTCTAGGGGCCCTATGAGTAGGATAAGAGGTGTGAATAGTCACAGTCGACTATTCCTGTGTTACAGATACGTGAGTGTCTGATCAGCTGCT2 



-eGFP.C.e.unc53 



-C.e. unc53 



HSLRSPGYSSYSPHLSVSADKOTHSMHSQTSRR 
CTTCTTCACAAAAACCAAGCTATTCAGGCCAATTTCATTCACTTGATCGTAAATGCCACCTTCAAGAGTTCACATCCACCGAGCACAGAArGGCGGCTCT 


GAAGAAG TGTTTTTGGTTCGATAAGTCCGGTTAAAGTAAGTCAACTAGCArTTACGGTGGAAGTTCTCAAGTGTAGGTGGCTCGTGTCTTACCGCCGAGA 



-eGFP.C.e.unc53



•C.e. unc53 



P S S 0 K PS YSGQrHSLORICCHLQEFTSTEHRWAAL 

Bam HI

CrTGAGCCCGAGACGGGTGCCGAACrCGATGTCGAAATArGATrC TTCAGGATCCTAC TC GGCGCG 7TCCCGAGG TGGAAGC TC T AC TGGTA7C TA fGGA 

GAACTCGGGC tc tgccc ACGGCTTGAGC TACaGCTTTaT AC taagaag tcc taggatgagccgcgcaagggctccaccttcgagatgaccatagatacc " 



-eGFP.C.e.unc53 



- C.e. unc53 



L S P R R V P NSMSKY 0 S S G S Y S A R S R G G S S T G | Y G 

Bam HI Nhe I Nde I

1 1 1 

Ga^ A„G r TCC AACTGC ACAGAC 7ATC CG ATGA AAAA FCCCCCGC ACA r 7C rGCCAAAAGTGAGATGGGATCCCAACTATCACrGGCTAGCACGACAGCA- 
'Z'Z 7GCAAGG rTGACGTGTCTGATAGGC rAC7 77 7 7A33GGGCGTGTAAGACGGTTTTCACTCTACCC7A3GG TTGA7AG TGACC GATCGTGCTGTCGT A 



-eGFP.C e unc53 



" — C.e. unc53 — — — — — — — — — — — — — — 

- r? CLHPLSOE<5PA.HSAKS£. v 1GSOL5LA5 7 T A 




PmaCI 

PmaCI Sal I

ArGGATCTCTCAATGAGAAGTACGAACATGCTATTCGGGACATGGCACGTGACTTGGAGTGTTACAAGAACACrGTCGAC TCACTAACCAAGAAACAGGA 
TACCTAGAGAGTTACTCTTCATGCTTGTACGATAAGCCCTGTACCGTGCACTGAACCTCACAATGTTCTTG TGACAGC TGAGTGATTGGTTCTTTG TCC" 



-eGFP.C.e.unc53



-C.e. unc53



yGSLNEKYEHAIROMAROLECYKNTVOSLTKKQE 

Ear I 

Hind III Cla I Ksp632 I

GAACTATGGAGCATTGTTTGATCTTTTTGAGCAAAAGCTTAGAAAACTCACTCAACACATTGATCGATCCAACTTGAAGCCTGAAGAGGCAATACGA7TC 


CTTGATACCTCGTAACAAACTAGAAAAACTCGTTrTCGAATCrTTTGAGTGAGTTGTGTAACTAGCTAGGTTGAACTTCGGACTTCTCCGrTATGCT^AG 



- eGFP.C.e.unc53 



* C.e. unc53 



N Y G A L F D L F € 0 < L R KLTQHIDRSNLKPEEA IRF 
AGGCAGGACATTGCTC ATTTGAGGGATATTAGCAATCATCTTGCATCCAACTCAGCTCATGC TAACGAAGGCGCTGGTGAGCTTCTTCGTCAACCATCTC 


TCCGTCCTGTAACGAGTAAAC TCCCTATAATCGTTAGTAGAACGTACGTTGAG TCGACTACGATTGCTTCCGCGACCACTCGAAGAAGCAGTTGGTAGA3 



-eGFP.C.e.unc53 



■C.e. unc53 



°- Q C I AHLRC I SNHLASNSAHANEGAGE LLRQPS 



Ear I 

Cla I Cla I Sst I Ksp632 I

7GGAATC AG T TGCATCCCATCGATCATCGATG TZATCGTCGTCGA AAAGCAGCAAGCAGGAGAAGATCAGC TTGAGC TCGTTTGGCAAGAAC AAGAAGA3 
MC:TTAGTCiiACGrAGGGTAGCTAGTAGCTACAaT AGCAGCAGCTTTTCGTCGTTCGTCCTCTTCTAGTCaAACTCGAGCAAACCGTTCTTGrTCTrCT: 



-eQFP.C.e.unc53 



-C.e. unc53 



li SVASH RSS MS S S S :< S S K Q £ K I 5 L S S F G K N * k S 
Bam HI Nde I BspM II

I ! ! 

-^'.ATCCCCTCCTCACTC TCCAAG T' C ACCAA3AA3 AAGAACAAGAACT ACGACGAAGC AC AT ATGCCAFCAAT T TCCGGATC TCAAGGAAC TCT TGA, 
GaCC TAQGCGAGGAGTG AGAGG7 TC AaG TGGTTCT TC TTC fTGTTC TTGA TGC TGCT TCG TG TATACGG TAGT TAAAGGCC TAG AG T TCC T fGAGAAC T'J 



- eGFP.C e uncS3 
C.e. unc53 — 



3SL 3 K - TK<KNKNY0EAHM3S ! 3GSC'*J T t 0 




Sst I ApaL I 

AAC ATTGATGTGATTG AGT TGAAGCAAGAGC TCAAAGAACGCGA TAG TGC ACT TTACGAAGTCCGCC TTGACAATCTGGA TC 3 T3CCCGC3AAGTTGAT3 
T TG T AAC r AC ACT A AC TCAACTTCGTTCTCGAGTTTC TTGCGCTATCACGTGAAATGC TTCAGGCGGAAC TGTTAG AC C X AG C ACGGGCGC 7 TC AACTAZ 



-eGFP.C.e.unc53



-C.e. unc53



N IOV IELKQELKEROSALYEVRLONL0PAREVO 

TTCTGAGGCAGACAGTGAACAAGTTGAAAACCGAGAACAAGCAATTAAAGAAAGAAGTGGACAAAC TCACCAACGGTCCAGCCAC TCGTGCTTCTTCCC3 
AAGACTCCCTCTGTCACTTGTTCAACTTTTGGCTCTTGTTCGTTAATTTCTTTCTTCACCTGTTTGAGTGGTTGCCAGGTCGGTGAGCAC3AAGAAGGG: 



-eGFP.C.e.unc53 



-C.e. unc53 



V L R E T VN KLK TE N KOL KKEVOKL TNGPATRASSP 

}<spf psrl a.su II 

cgcctcaattccagttatc tacgacgatgagcatgtcta tgatgcagcgtg tagcagtacatcagc tag tcaatcttcgaaacgatcctctggc tgcaac 
gcggagtt^aggtcaatagatgctgctactcgtacagatactacgtcgcacatcgtcatgtagtcgatcagttagaagctttgctaggagaccgacgtt: wc ~* 



-eGFP.C.e unc53 



-C.e. unc53 



A S IP V I Y d oehvy qaacsstsasqsskrssgcm 

Pvu I 

Hpa I EcoR V

TC AATCA^GGTTAC TG TAAACGTGGAC ATCGC TGG AGAAATC AG TTCGATCGTTAACCCGGACAA AGAGATAA TCG fAGGATATCTTGCCA^GTCAACC A 
AGfTAGT tccaatgac atttgcacctgtagcgacctctttagtcaagc TAGCAATTGGGCCTGTTTCTC TATT AGCATCC TAT AGAACOGT ACAGTTG'j" 



-eGFP.C.e.unc53



"C.e. unc53 



SlKvrvNVCl ACE ISSIVNPOXEIIVGVLAKST 

Cla I 

i 

0 rc ag rc atgc tgg aaagaca ttgatg t ttc tat re t aggac tatt tgaag re tacc tatcc agaat tga tgtggagc atcaac t tggaa r iga tgctcc 


CA3 TCAGTACGACC TT TC TG7AAC TACAAAGATAAGA 7CC TGATAAAC TTCAGATGGATAGG TC TTAAC TACACC TCGTAGTTCAACC T7A3C T ACGAGC 



-eGFP.C.e.unc53



-C.e. unc53

: >SCVKOI0 7S:LGLrEVyLSR!CVEHQLG! OAP 




Mlu I 

rGArTCTATCCTTGGCTATCAAATTGGTGAACTTCGACGCGTCATTGGAGACTCCACAACCATGATAACCAGCCATCCAACrGACATTCTTACTrcCKA 

ACTAAGArAGGAACCGATAGTTTAACCACTTGAAGCTGCGCAGTAACCTCrGAGGTGTTGGTACTATTGGTCGGTAGGTTGACrGTAAGAATGAAGGAG' 



-eGFP.C.e.unc53



-C.e. unc53



OSlLGYQIGELRRV[GOSTTMirSHPTDlLTSo 

actacaatccgaatgttcatgcacggtgccgcacagagtcgcgtagac agtctggtccttgatatgcttcttccaaagcaaa tgattctccaac tcgtca 


tgatgttaggcttacaagtacgtgccacggcgtgtctcagcgcatctgtcagaccaggaactatacgaagaaggtttcgtttactaagaggttgagcag- 



-eGFP.C.e.unc53



-C.e. unc53 



TTIRMFMHGAAQSRVDSLVLDMLLPKQMILQLV

jAat II 3srl psrl A.su II 

AGTCAATTTTGACAGAGAGACGTCrGGTGTTAGCTGGAGCAACTGGAATTGGAAAGAGCAAACTGGCGAAGACCCTGGCTGCrTATGTATCTATTCGAAC 


TCAGTTAAAACTGTCTCTCTGCAGACCACAATCGACCTCGTTGACCTTAACCTTTCTCGTTTGACCGCTTCTGGGACCGACGAATACATAGATAAGCTTj 



-eGFP.C.e.unc53 



-C.e. unc53 



KSILTERRLVLAGATGIGKSKLAKTLAAYVSIRT

Bsm I Xmn I Bgl II

AAA TC AA TCCGAAGAT AGT ATTG TTAATATCAGCA7TCC TGAAAAC AA TAAAGAAGAATTGC TTCAAG7GGAACGACGCC TGGAAAAGATC TTG AGAAGC 
TTTAGTTAGGCTTCTATCATAACAATTATAGTCC TAA33AC TTTTG TTATTTCTTCTTAACGAAGTTCACC TTGC TGCGG ACC TT TTC TAG A AC TCTTCC 



-eGFP.C.e.unc53 



-C.e. unc53 



N Q 3 E D 5 I V N 15 1PENNKEELLQVERRL.EK I L P * 
Ava III

Nsi I Xba I 

j | 

toAGAATCATGC AfCGTAA TTC TA3 A 7AAT A 7CCC A A A3 A A TCG AA TTGC A TT TGTTG TATCCG T 7 TT TGC AAA7G TCCC AC T ]*C AAAACAACo AAGG TC 
7TC TTAGTACG7A3C ATT AAGA 7C 7.* 7Ta 7-GCG ~77C TTAGC TTaACG TAAACAACATAGGC AAAAAC3 TT TAC AGGG TGAAG TT T TG T TGC TTCCAC 




EcoR V

CATrTCTAGTArGCACAGTCAACCGATATCAAATCCCrGAGCTTCAAATTCAC CACAATrTCAAAArGTCAGTAATGrCGAATCGTCTCGAAGGATTCA" 
GTAAACAfCATACGTGrCAGTTGGCTATAGTTTAGGGACTCGAAGTTTAAGTGGTGTTAAAGrTTrACAGrCArTACAGCrrAGCAGAGCTTCCTAAGTA ^ 



-C.e. unc53 



PFVVCTVNRYQIPELQIHHNFKMSVMSNRLEGFI

Ear I 



pst I f<sp632l 



i 

CCTACGTTACCTCCGACGACGGGCGGTAGAGGATGA3TATCGTCTAACTGTACAGATGCCATCAGAGCTC TTCAAAATCATTGACTTCTTCCCAATAGCT- 
GGATGCAArGGAGGCTGCTGCCCGCCATCTCCTACTCATAGCAGATTGACATGTCTACGGTAGTCTCGAGAAGTTTTAGTAACTGAAGAASuGTTATCGA ^ 



-eGFP.C.e.unc53 



-C.e. unc53



I R * L R R R A V £ Q E Y R L T V OMPSELFK I IOFFP Ia 
Ear I 

Ksp632 I EcoR I Sph I Bam HI

CTTCAGGCCGTC AATAATTTTATTG AGAAAACGAATrCTGTTGATG TGACAGTTGGTCCAAGAGCATGCTTGAAC TGTCCTCTAACTGTCG ATGGATCCC 


GAAGrCCGGCAGTTATTAAAATAACTCTTTTGCTTAAGACAACTACACTGrCAACCAGGTTCTCGTACGAACTTGACAGGAGATTGACAGCrACCTAGGG ' 



-eGFP.C.e.unc53 



-C.e. unc53 



LQAVNNFIEKTNSVDVTVGPRACLNCPLTVDGS
GT3AATGGrTCATTCGATTGTG3AATGAGAACTTCATTCCATAT TTGGAACGTGTTGCTAGAGATGGCAAAAAAACCTTCGGTCGCTGCA:TTCCTTCGA 

CA2 ttaccaagtaagc taacaccttactcttgaagt aaggtataaaccttgcacaacgatc tctaccgtttttttggaagcc agcgacgtgaaggaagct 



-eGFP.C.e.unc53



-C.e. unc53 



R £ V F ' ^ L V N £ N r I P Y L £ R V AROGKKTFGRCTSFE 

0am HI Jin I Tth , 

5 1 \ 

G ■ j A f C C C A C C G A C A T C G T C rCTAAAAAATGGCC j TG5 ~ TCGA TGG TGAAAACCCGGAGAATG TGC 7CAAACGTCT TCAAC TCC AAGACC TZSTCCGTCA 

CC TAGGG TGGC TC r AGCAG AGAT7TrTTACCGGI ACC AA3C T ACCACTTTTGGGCCTC TfAC ACGAG TTTGCAGAAGTTGAGG TTCTGGAG^AuGGCAGT 



-eGFP.C.e.unc53 



-C.e. unc53

C ? 7 0 1 V S K < W P V F 0 G E M P E N V L < R L Q L 0 0 L V P 




BspM I Xho I Sph I

CCTGCCAACTCATCCCGACAACACTTCAATCCCCTCGAGTCGTTGATCCAATTGCATGCrACCAAGCATCAGACCATCGACAACATTTGAACAGAAGACT 
GGACGGTTGAGTAGGGCTGTTGTGAAGTTAGGGGAGCTCAGCAACTAGGTTAACGTACGATGGTTCGTAGfCTGGTAGCTGTTGTAAACTTGTCTTCTGA ^'^ 



>. 



-eGFP.C.e.unc53



-C.e. unc53



PANSSRQHFNPLESLIQLHATKHQTIDNI.TED



Asp 718
Kpn I

CTAArCTTCTCTCGCCrCTCCCCCGCTTTCCTTATCTTCGTACCGGTACCTGATGATTCCCCATTTTCCCCCTTTTCCCCCCAATTTCCCAGAACCTCCT 
GATTAGAAGAGAGCGGAGAGGGGCCGAAAGGAATAGAAGCATGGCCATGGACTACTAAGGGGTAAAAGGGGGAAAAGGGGG6TTAAAGGGTCTTGGAGGA 
SNLLSPLPRFPYLRTGT. .FP tFPLFPP ISQNLL 

Xma I 

Sma I Dra I Xmn I

GTTCCCTTTGTTCCTAGTCCTCCCGGGTGCCGACGCCGAAGCGATTTAAAAACCTTTTTCTTTCCGAAACATTTCCCATTGCTCATTAATAGTCAAATTG 


CAAGGGAAACAAGGATCAGGAGGGCCCACGGCTGCGGCTTCGCTAAATTTTrGGAAAAAGAAAGGCTTTGTAAAGGGTAACGAGTAATTATCAGTTTAAC 
FPLFLVLPGAOAEAI . K P F S F R N I S H C S L I V K L 

Xma I

Apa I
Sma I

Bam HI Xba I Bcl I

Bcl I



AArAAACAGTGTATGTACT TAAAAAAAAAAAAAAAAAAACTCG AGGGGGGGCCCGGGATCCACCGGATCTAGATAACTGATC ATAATCAGCCATACCACA 
TTATTTGTCACATACATGAATTTTTrTTTTTTTTTTTTTGAGCTCCCCCCCGGGCCCTAGGTGGCCTAGATCTATTGACTAGTATTAGTCGGTATGGrGT 
NKQ CflYLK KK KK X LEGGPGIHRI ITDHWQPYH 

Dra I Bsm I Hpa I

T f TGT AG »GGTT T TAC fTGCT TT AAAAAACCTCCC AC ACCTCCCCC TG AACCTGA AAC A fAAAA TGAATGCAA TTGTTGTTG TTAACTTGTTTATTGCA3 


AAACATC TCC AAAAfGAACGAAATTTTrTGGAGGGTG TGGAGGGGGAC TTGGACTTTGTATT TTAC TTACG TTAACAACAAC AAf TGAAC AAAfAACGT~ 
iCRGFTC FKX PPTPP PEPET.NECNCCC .LVYCS 

Bsm I 

i 

C TTATAA TGG r TACAAA7A AAGC AATAGCATC AC AAA TT rCACAAAtAAAGCATTTTTTTCACTGC ATTCTAG T TGTGGT TTG TCCAAAC fCAfCAATGT 
GAArArTACCAArGTTrATrrCGTTATCGTAGTGTTTAAAGTGrTTArTTCGTAAAAAAAGTGACGTAAGArCAACACCAAACAGGTTTGAGTAGTrACA 
^ ■ 'J L Q I K Q , HHKF HK SIFF TAF LWFVQTHQC 

Mlu I Ssp I

A-CTTAACGCGTAAATrGTAAGCGrTAArATTTTGTrAAAATTCGCGTTAAATTT TTGTrAAATCAGCTCArTrTTTAACCAArAGGCCGAAArCGGCAA 
* AG AA r TGCGCA TTTAACA ftCGCAATfATAAAAC AAT T TTAAGCGC AAT T TAAAAAC AA 7T fAGTCGAG TAAAAAA X TGG T T ATCCGGC f FTAGCCOTT 
!Lrfi <t .AL!FC.MSR. IFVJCSAMFLTNRPICSA 




Bsr I

AATCCCTTATAAArCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGrCCACTATTAAAGAACGT GGAC TCC AACGTCAAA 
TTAGGGAATATTTAGTTTTCTTATCTGGCTCTATCCCAACTCACAACAAGGTCAAACCTTGTrCTCAGGTGATAATTTCTTGCACCTGAGGTTGCAGTTT ° Cw *' 
K S L I N Q K N R P ft , G VLFQFGTRVHY . RTVTPTS* 

Dra III

GGGCGAAAAACCGTCTATC AGGGCGATGGCCCACTACGTGAACCATCACCC TAATCAAGTTTTTTGGGG TCGAGGTGCCGTAAAGCACTAAATCGG AACC 
CCCGCTTTTTGGCAGATAGTCCCGCTACCGGGTGATGCACTTGGTAGrGGGATTAGrTCAAAAAACCCCAGCTCCACGGCATTTCGTGATTTAGCCTTGG °*' C '* 
GEK PS I R A MAH YV NHH PNQV FVGRGAVKH . IGT 

Nae I

CTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGG: 

GArTTCCCTCGGGGGCrAAATCTCGAACTGCCCCTTTCGGCCGCTTGCACCGCTCTTTCCTTCCCTTCTTTCGCTTTCCTCGCCCGCGATCCCGCGACCG 
L K G A P 0 L E I 0 G E S R R T VRERKGRKRKERALGRV 

J<«P« Kspl 

A AG TGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGA 

TTCACATCGCCAGrGCGACGCGCATTGGTGGTGTGGGCGGCGCGAATTACGCGGCGATGTCCCGCGCAGTCCACCGTGAAAAGCCCCTTTACACGCGCCT 
QV. RS RC A P R H P P RLNRRYRARQVALFGEMCAE 

Ear I 

BspH I Ssp I Ksp632 I

acccctatttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagt: 


TGGGGATAAACAAATAAAAAGATTTATG TAAGT7TATAC ATAGGCGAG TACTCTGTTaTTGGGACTATTTACGAAGTTATTATAACTTTTTCCTTC TCA3 

p t-F VYF SKYIQlC lRS ,ON NPOKCFNN I E K G R V 

Sph I

Ava III

OxaN I Pvu II Nsi I

CTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGG CAGAAGTATGCAAAGCATSCArCTCAAT 
GACTCCGCCTTTCTTGGTCSACACCTTACACACAGTCAArCCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTrCGTACGrAGAGTTA 
LRR KE PAVEC VSV RV VKVPRLPSRQK YAKHASQ 

Sph I
Ava III
Nsi I

rAGTCAGCAACCAGGTGTGGAAAGfCCCCAGGC. CCCCAGCAGGCAGAAGTATGC AAAGCATGCATCTCAATTAGrCAGCAACCATAGTCZCGCCCCTAA 
A7CAGTCGTTGGTCCACACCTTTCAGGGGTCCGAGG3oTCGTCCGTCTTCArACGTTTC3rACGTAGAGTTAATCAGTCGTTGGTATCAG33CGGG3AT-' 
L 7 s N Q V VK V PRLPSROKYAKHASOLVSMHSPAPM 

Bgl I 

Bsr I Nco I Sfi I

C rnCGCCCA TCCCGCCCCT AACTCCGCCCAGTTCCGCCC ATTCTCCGCCCC ATGGC TGAC TAAT TT T TT 7 r AT fT A TQCAGAGGCCGAGOCCGCCTCGG- 
'jAGGCGGGrAGGGCGGGSA r7GAGGCG33TCAA3«jCG 33 TAAGAGGCGGGG TACCGaC TGAT r AAA a AAA A TAAA TACG7C TCCGGC TCC3 3C33AGCC3 
I* H P A p N3A0FRPFSAPVLrMFFYLC*iGR3«L'.i 




Stu I 

Avr II Cla I

CTCrGAGCTATTCCAGA&GTAGTGAQGAGGCTTTTTTGGAGGCCTAGGCTTT 

GAGACTCGATAAGGTCTTCATCACTCCTCCGAAAAAACCTCCGGATCCGAAAACGTTTCTAGCTAGTTCTCTGrCCTACTCCTAGCAAAGCGTACTA^CT 
I ; A t P £ V V R R L F V R P R L L Q R S IKRQOEDRFa.l 

BspM I Xma III

ACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCrATTCGGCTATGACTGGGCACAACAG ACAATCGGC TGC TCTGATGCCGCCGTG 
TGTTCTACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACTACGGCGGCAC 
MK MQC TQ VLR PLGVP GYSAMTGHNROSAALMPPC 



Nar I

B° a » jCtpl 
TTCCGGC TGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCA AGACGAGGCAGCGCGGCTArCGTGGC 
AAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAACAGTTCTGGCTGGACAGGCCACGGGACTTACTTGACGTTCTGCTCCGTCGCGCCGATAGCACCG 
_SGCQ R RGARFFISRPTCPVP . MNCKTRQRGYRG 



Bal I Pvu II Tth111 I

Fsp I

TbuLCACGACGGGCGTrCCTTGCGCAGCTGTGCTCGACGTTGrCACTGAAGCGGGAAGGGACTGG CTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCT 
ACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAACAGTGACTTCGCCCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGA 
R R A F L AQL CS T L S L KREG TG CYVAKCRGRI3 

BspM I

G TC ATC TCACC T TGC TCC TGC CG AG AAAGTATCC A 7CATGGCTGATGCAATGCGGCGGC TGC ATACGCTTGATCCGGCTACC TGC CC A TfCGACCACC AA 
CA5TAGAGTGGAACGAGGACGGCTCrTTCATAG3TAGTACCGACTACGTTACGCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAGCTGGTG»3TT 
CH l T lllPR<Y?SVLMQC G GC I R I IRLPAHSTT* 

Ear I 
Ksp632l 

GCGAAACATCGCArCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAG AGCATCAGGGGC TCGCGCC AGCC 3AAC 
CG:TTTGTAGCGTAGCTCGCTCGTGCAT3AGCCTACCTTCGGCCAGAACAGCTAGTCCTACTAGACCTGCTTCTCGTAGTCCCCGAGCGCGGTC3GCrT3 
RNC AS3EHV LGVK PVL SI R MIWTKSIRGSROPfl 

Sph I Nco I

. 'j ■ 1"CGCCAGGCTC AA3GC3AGCATGCCCGACG3CGAGGATC TCGTCG TCACCCATGGCG AfGCCTGCT TGCCGAATA TCATGGT GGAAAATGGCC 3C TT 
AC AAGCGGTCCGAG TTCCGCTCG TACGGoC TGCCGC TCC TAGAGCAGCAC TGGGTACCGC TACGGACGA ACGGCTT AT AG T AC C ACC TT TTACCGGCGA A °~ 
CSPGSR » A CP T^? I5S ; PMAMPACR t S V V K M A £ 

Ear I 

Nae I Rsr II Ksp632 I

■:r3GATrCArCGACrGTSGCCGGC:G2GTGT:^::;ACCG:TATCAGGACATAGCGTTGGCTACCCGTG ATATTGCT3AA3AG:Tr.:GCGGC2AA-G^ 
«*13 AZC TAAG TAGC TGACACCGGCCGaCCCAC A.CGC Z F3GC 3 A TAG TCC TGTATCGC AACCGA TGGGCAC TA TAACGAC TTC TCGAACCGCCGC T 
f : 0 3 S r V 4 S V V ? T A I R T . R V L P V [ L L < 3 L A A fl <i 




GCrGACCGCTTCCTCGTGCrTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTrCTArCQCCTTCrTGACGAGrTCrTC TGAGC GGGAC TC TGG3 
CGACTGGCGAAGGAGCACGAAATGCCATAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAAC TGC TCAAGAAGACTCGCCC TGAGACC-" ^'^ 

1 T A . s 3 c F T v s p L p . 1 R s a s p s r afltssser 0Sg " 

Asu II BspM I

GTTCGAAATGACCGACCAAGCGACGCCC AACCTGCCATCACGAGATTTCGArTCCACCGCCGCCTTCTATGAAAGGTTGGG CTTCGGAATCGTTTTCCGG 
C AAGC TT TAC TGGC TGG TTCGC TGCGGG TTGG AC GGT AG TGC TC TAAAGC TAAGG TGGCGGCGG AAGATAC TTTCCAACCCG AAGCC TTAGC AAAAGGCC 3 " ^ 
VRM ORPSO AQ PA t TRFRFHR RLL . KVGLRNRFP 

pae I Kspl ^ vr ,| 

GACGCCGGCrGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCTAGGGGGAGGCTAACTGAAACACGGA AGGAGACAATACC^ ' 

CTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGATCCCCCTCCGATTGACTTTGTGCCTTCCTCTGTTATGGCC ^ 
GR RLOOP PAR GS HAGVL RPP . GEAN . N T E G 0 N T G 

J<spl 

AAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATA/UACGCACGGTGTTGGGTCGTTTGTTCATAAACGCGGGGT TCGGTC CCAGGGC TG GC AC TC 
TTCCTTGGGCGCGATACTGCCGTTATTTTTCTGTCTTATTTTGCGTGCCACAACCCAGCAAACAAGTATTTGCGCCCCAAGCCAG6GTCCCGACCGTGAG ^ 
R N P R Y D G N K K T E . N A R C V V V C S . T RGSVPGLAL 

rGTCGATACCCCACCGAGACCCCATTGGGGCCAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGTTCGGGTGAAGGC CCAGGGCTCGCA 
ACAGCTATGGGG TGGC TCTGGGGTAACCCCGGTTATGCGGGCGCAAAGAAGGAAAAGGGGTGGGGTGGGGGGTTC AAGCC CACTTCCGGGTCCCGAGCGT ^ 
CRY PTETPUG PIR PR FFL FPTPPPKFG R P R A R 

AJwnl .OxaNi Prai n ra , 

GCCAACGTCGGGGCGGCAGGCCCTGCCATAGCCTC AG oT TAC TC AT ATAT ACT TTAGATTGATTTAAAACTTC AT TTTTA ATTTAAAAGGATCTAGGTG 
CGGTTGCAGCCCCGCCGTCCGGGACGGTATCGGAGTCCAATGAGTATATATGAAATCTAACTAAATTTTGAAGTAAAAATTAAATTTTCCTAGATCCACT ^ 
3QRRGGR PCH SLR LLIYTL0.FKTSFLI.K0L6E 



BspH I 

! 



AGATCCTTTTTGATAArCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGA 
TC TAG GA AAA AC TATTAGAGTAC TGGTTTTAGGG AATTGCAC TCAAAAGC AAGGTGAC TCGCAGTC TGGGGCA TC T TTTCTAGTT TCC TAGAAGAACTCT >:: '" 
DPF - S HDQNPL T , VF VPLSVRP R R K 0 Q R I f L R 

7CCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTG7TTGCCGGATC AAGAGC TACCAAC TC TT TT TCC 3 
AG3AAAAAAAGAC3CGCATTAGACGACGAACGTT7GTTTTTTTGGTGGCGATGGTCGCCACCAAACAAACGGCCTAGTTCTCGATGGTTGAGAAAAAGG: ' ' 
S r F S A R N L L L A H < K T T A T S G G ■ F A G3RATNSFS 

Bsr I

aaggtaa;tggcttcagcagagcgcagataccaaa-a:t3Tccttctagtgtagccgtagttaggccaccacttcaacaac t c tg tagc AC c GC c T AC A T 

" "-C A r TGACCGAA3TCGTCTCGC5 TC TATGG T7 T a T 3 ACAG3AAGATCAC ATCGGC ATC AA TCCGG TGGTGAAG T TC T TGAG AC ATCGTGGCGGATG TA 
E G N V t QQ S A 0 T < ' C P S S V A V V R P P t 0 E L C 3 T A Y I 




AlwN I

aCCTCGC TCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGAC TCAAGACGATAGTrACCGG A X AAGGCGCA 
rGGAGCGAGACGATTAGGACAATGGTCACCGACGACGGTCACCGCTATTCAGCACAGAATGGCCCAACCTGAGTTCTGCTATCAATGGCCTATTCCGCGT ^ 



PRSANPVTSGCCQVR 



VVSYRVGLKTIVT6.G 



ApaL I

GCGGTCGGGCTGAAC6GGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCG TGAGCTATGAGAAAGCGCC 
CGCCAGCCCGACTTGCCCCCCAAGCACGTGTGTCGGGTCGAACCTCGCTTGCTGGATGTGGCTTGACTCTATGGATGTCGCACTCGATACTCTrTCGCGG 
A V G I NGGFVH TAQ LG ANOL HRTEIPTA.AMRKR 

ACGC TTCCCGAAG6GAG AA AGGCGG ACAGGTATC CGG TAAGCGGCAGGGTCGGAACAGGAGACCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGG TATc' 
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TGCGAAGGGCTTCCCTCTTTCCGCCTGTCCATAGGCCATTCGCCGTCCCAGCCTTGTCCTCTCGCGTGCTCCCTCGAAGGTCCCCCTTTGCGGACCATA3 *~ * 
H A S R R £ K G G 0 V S G K R Q G RNRRAHEGASRGKRLVS 

TTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTAT GGAAAAACGCCAGCAACGCGGC 
AAATATCAGGACAGCCCAAAGCGGTGGAGACTGAACTCGCAGCTAAAAACACTACGAGCAGTCCCCCCGCCTCGGATACCTTTTTGCGGTCGTTGCGCC5 ^ 
1 ♦ S . c R V ,S PPLT.ASIFVHLVRGAEPMEKRQQRG 



Ava III 
Nsi I 

CTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTSCTCACAT GTTCrTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCATGCAr 


GAAAAATGCCAAGGACCGGAAAACGACCGGAAAACGA3TGTACAAGAAAGGACGCAATAGGGGACTAAGACACCTATTGGCATAATGGCGGTACGTA 
L F T v PGLLLAFCSHVLSCVIP F C G . PYYRHA 
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A i i'iACCA fGAI* I ACGCCAA3CTTGCA"GCCTGCA6GAATTCGATATCAAGCTTATCGATAGCGTC(JACCT /O 
.'. G AG G ATC AC A AG A A ATTGG AC C A AC 7 ACC C AC A T CCA f fAfGCCACCCGCGG I* f IC PA AG 7GAG 7T TAA I MO 
FT TTGAGTT PACGAC rACAAAAATGTG7TCTTTAATAACTATCT T CGACTTGAG T C TAT"CTGTATCACT 
AGTTGTTGAGTGATTTTTCATTGAGAAAATATTAAAAGGAACA P*ATf*TACTTTGC f TATT 7CCCCTAAC 280 
TT TG A 777AG77 777CSA 7C AAC 7AGA r C77ACAAAAC77GCAA7ACAA7TCC ATT TTC AG AT TACCC T C 3fiO 
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:CCACG7CTCGCCACG~CAGCA4CCGCT7CAGCAAC.: I'AACCC AAA f TCCAAC C I' fCCAG AAA I G I'CAACA U?0 

TCCAGGC 7 T CAGAC7CC ACAG7CAAGA 47A 7CGAAAAT fGG TAAGAAT T T "A fT I' PGAGC r C AAAC I IG r MPO 

iTAAAA 7GCCCA(iAAAAGAAGATGA r AAAAATGTAGTT77777GC AAAAC 7TCCAC CTTTATTGCTCTAA 560 

TATGACG6CTTA7A7C70 AAT777C":3A3TT7TA TC AAA AAA 7 r TfCCAC TA 7 AC AAA TG I AGAAAAG r 630 

All i GC AC AAA I !' l"G 7CAG7 I GACAoC 77TGTAATAGATCCAAATGGAACCTAG ATACAAGC T G77AA AX> 
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AG j (JG AAGGAGCGCAAG I C ' A I' AC I i^GAAA 7AATGATCTGAAACAAATTTGTGCTATTCTC AAATGTTTA /A) 
i*." AC A TG T T TTG A AG A'"T T 7 T T C AAA 7 ?CG C A C TAGT T 7C AG A AC C T TCC T TTTTG T A TG A A A AAG T A AA tV40 
' A A A A AC T A 7 T7 C AAAC C C 7CACC G C C *C C AT G r T TC A AC TC 7 T A A 7 T T 7 T A 7 A A A A 7 7 77 G C £ A77 7 AC 910 
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rc ? x a g a a aa c ac ag a ag ag g c t a a - 7 a a a t a gg g ac ag g t r s t c c c rc: r rc rccc rcc: r rc: rc 1050 
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;.':GCC7CCTCC7CCC'i-rCCArc rCCAACAACAACAAT-TTCCAA7TTCG7TGTCCA-rrrGC rrATAAA 1 l?0 

•;AI ! I'G TG TG i\7iUAAoGAAAC7ACAC3t:G3AoACGG7f AA77AATTCGAA **GAGAGCA I'GGCAA.'* f AC I'C I 130 

77TCGGAAATTGATG#\ATAAAGATACAr.C::GA I'GAC AC TG GC 7GG TAG 7 AG T ATGA G i GTAG NA77GC77 1260 

77TCA7CGTCTCAACT'GCGCA 7GAG V. • 7 CC C GG T C TC ATC AC TG AC A ATT A A TG TC G GG T TT 7 A r G 1330 

:r C7C777CCrA* r rcCGCCAC7CA7'C7Gr,r,T7^.CCArAAAC rGGAA**ACAI " I' I" A C 1' AC T A r J'C AAGCC 14W 
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*X7 T r A7 T T7CA7A 7 f !"AA f I* 1' I 3 T G C A A r T AG G G A T A A AC AC G AC 7 T r TAA A AG TTTATTTAA AAA A AC G H/0 
AT A77 TTCG A77 T 7A AAAiA 'C i G AAAAJTTTC AAAA* A^T.AA fAAA "A C TCCC ! A ACAAATTGTATGGC 10<K) 
t a AAA r I I :AI* f fCTAr TGTTG ACAA"A7CTT TA TA I o FATCAC7GT7TTCC ATCTCAAAACC TTG A A r C 1610 
C '*. C C A AG T T A T A GO A AC C 7C CC: TG TO A f AT 7 7 T C C ATG C T A r G A A I'CG C f AC T C AG C A C A T A T CC A A A A A 1600 
T T AAG C T AG AC G C: T 7 .".1 A T A A 7 r A I' l*Go oC A G Z G T A A T A A AG TGC A A'» j C AC I' I" AG A AT 777 A A TTCAAGC I 75C 
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a.- a • 3 A 7 7 A T C T / ; T T A A A T T C A A V. I* i" TCAAGATTC ACCCAGTTCG 'AC A A f I I" ICC A7GCTT TTTGGCCC '320 
A T TAAAAAAC T " T !\;aCC fCT'C A7CC' T C"CACTCG I' A I'C A 7 A A A A A G 7 A 7 ACC A A AAG C C CG A C f"C t 1 G90 
A^: T777TAAGAijAAGGAGA7AC"GAGCC!AC A'"GGCti ■ GTGACCCT~TTCATC7CGTCCG I" I'CGG TC TC^A lOOC 
AT ACGCTCA TAC i AACTCTTCA** A "A5CCA7 AGACC TCCTTGT r TTC "TC F TCGTTTTGAC TCGCGCC. 2030 

n ri-r ig ."ggctgcctgaaagccgggaaaatttagt atattta^gagc r i'atct ttatqcaatacata 2 iu; 

1 , : 0 2 : W 2 30 2 I Uf. ? I W> 'J 100 2 I TO 


/.', \ A ACG AfJ oCAA I r a A A A AT A ? 7 AA T f A A VG AGS T T G T A GA TG T AG A ! i f GG AA A AG A A G A A A A A A ..'I/O 
A.' diAACAAATAGGAACCGCCAGA rCA/.A.\TrCTAT r r AAAGG f r T TC A AG A TCT7 TAGGC A A GAT T ^ G<: 
'.: ; i A AC A C A AAAC TC A A'i TG C C I GC A "iAA7C TAG TGTAACGT T7 AGATTCAACTCGG AAA FCC r AAGCC \<> J 
T" t r. AC TATA CCC t r A ; *C rAGA7C7-Ar.T-GCGCA I'AAGC TCAAGCCC AAGC A GAA ATGAC 7 7 CCA 7 ! I •> ;rJ»J 
!i' I AAGCC rAGAT~GAC T 7CC T 7G'* ' C A i: f C T A A 7C C AG AC r A G A T T 7 C C A AG A G A ( ' ! : ! I CAA I I I » 
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A A A T C T T TC C AG T T TC T T ( i T T AC f T A A A A T C T T AA TGf! CC TG T 6 A TGC G7 A A A ATC G T TAT C C C T T T C TC 2f.«0 
'.'AC AC r I JTAArTACAGATTCATCAAAGATTGGTATCAAGCCAAAGACG rC TGGACr E'AAACCACCC CC 2590 
^7r ATrAACCAC TTCA TC AAA f AA i AC) AAA !' J CATTCCGTCCGTCGAGCCCTTCGAGTGGC AATAATAAT 2b(iO 
-TTGnCTCGACCATATCCACATCTGCGAAGAGCTTAGGTATCCGATCCTTrCGGCTTCTTTTTAGAAATT :>/30 
AT A T TAT T T C AG A A TC A TC A TC A AC GT AC A GC CC T ATT TC G A ATC TAA AC C G ACC T AC C TC CP. AAC Tf.C A ?800 
2810 2820 260O 2040 2850 2860 2870 
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AAAACf* r TC I'AGACCACAAACCCAGCTAGTTCGTGTTGCTACAACTACAAAAATCGGAAGC TCAAAGCTA 211/0 
::C,:GCTCCGAAAGCCGTGAGCACCCCAAAACTTGCTTC rGTGAAGAC TA r TGGAGCAAAAC AAGAGCCCG 29'10 
ATAACACiCGGTGGTGGTGGTGGTGGAATGCTGAAATTAAAGTTATTCAGTAG^AAAAACCCArCTrccrc HO 10 
Art;GAATAGCCCACAACCTACGAiiAAAGSCR(iCG6CGSTGCCTCAA::AACAAACTTTGTCGAAAATCGCr DOIM 
" C C C C AG TG AAA AGTGG C C T G A A G C CC- C C G AC C AG T AA GC TGG G A AG TGC C A C G TC. T A TG T C G A AGC T r T 3160 
3160 3170 3180 3190 3200 3210 3220 


.1TACGCCAAAAGTTTCCTACCGTAAAACGGACGCCCCAA7CATATCTCAACAAGAC TCGAA ACGATGCTC :K20 
V tV'AGCAG TG A AGA AG A3 TCC GG AT ACGC TG C AT TC A AC AGC AC GTCGCC A ACGT CATC A TCGACGGAA 3290 
VTrCCCTAAGCA;TiCATrCCA::A7C T TCCAAGAG7TCAACGTCAGACGAAAAGTCTCXG"CATCAGArG 33G0 
ATCTTArTCTTAACGCCTCCATCGTGACA^CTATCAGACAGCCGATAGCCGCAACACCGGTTTCTCCAAA 3430 
r A T T A TC A AC AA G C C TC T TG AG wi A AA A AC Z AAC AC PG'iCAG I'GAAAGGAG fGAAAASCACAGCGAAAAAA 3500 
3?3IO 3G20 3530 3c 40 3550 3S60 3570 
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urCCACCTCCAGCTGTTCCGCCACG^GACACCCAGCCAACAATCGGAGTTGTTAGTCCAATTATC.GCAC 38/0 
A T A A G £ A G T TG A C A A A ~G AC CC C G TO A TAT C TG AA AAA CC AG A AC C TG A 4 A AGC TC C A ATC A A f G AGC A T 38U0 
" •* : A C ACG AC GG A C G T TCC A C CG :Z T TC C Af; C rc r AA A A T C A S T T G T TCC AC T TAA A A TG AC T TC A A TC C GA 3/10 
2 A AC C AC C A AC G ^ AC G A T G T TC T T C I" A A- AC A AGG A AA AATC AC A TCGCC T GTC A AGTCGTTTSGAT A TG 3 /IK) 
^rAGTCGrCCGCGTCTGAAGACrCCA-TGTGGCTCATGC^aCGGCTCACiGTriACTCCGCCGACAAAAAC 3AS0 

3560 3870 3680 3800 3GC0 WJIO 30X> 
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TCACCATCCrACTCGflCCCG r rCC£4A6S".*6AA6CTCTACT66TA I'A I'GGAGAGACG I I CCAAC I GC 
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A '" rC AGGCACaGACATTCCTCATTTGAGoGA ;"A T TAGCAATCATCTTGCATCCAACTCAGCTC/^ TGCTAAC 49 Hj 
v. A AG GC GC TGG T G AG C T TC TTC G T C AA ^ C 4 IT rCTGGAATCAGTTGCATCCCATCGATCATCGATG TCAT 5040 
d\ fC CiTCG A AAAGC AGC A AG C AGG AO A AG A TC AGC TTGAGC TCCi r f rGGC AAGAACAASAAGAGC I'GGA I r> ! 10 
COXTCCTCACTCTCCAAG I I CACCAASAAGAAGAACAAGAACTACGACGAAGCACA r A TGCCA fC AAT 5180 
."CCGGATCTCAAGGAAC I'C I" l*GACAACATTGATGTGATTGAGTTGAAGCAAGAGC PC AAAG AACGC G A i'A 5250 
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G 7 G C AC T T T AC G A AG PCCGCC I fGACAATCTGGATCGTSCCCGCGAAGTTGATG f I'C I'GAGGGAGACAG f 5320 
UAAUAAla I i b^A AAUl.h *\ At*. AAli L A - r > T T AA AG A AAG AAG TGGAC AAAC TCACCAACGCi rCCAGCCAC I" 3390 
CGTCJCTTCTTCCCGCGCCTCAATTCCAGTTATCTACGACGATGAGCArGTCTATGATGCAGCGrGTAGCA 5U60 
QTACATCAGCTAGTCAATCTTCGAAAC3A f CC TCTGGCTGCAACTCAATC AACGTTAC TGTAAACGT3GA 5530 
i* a i CO*'*. ICG AGAAATCAGTTCGATCGTTAACGGGGACTTG AAGCAGCAGGAATTCTTCCTGGGCTG TAGC hBCX> 

5610 5620 5630 56*40 5650 5660 b6/0 
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AAG'o J'CAG TG'JAAAAG I' fGACTGGAAG-TGCTGGATGAAGCTGTTTTCCAAGTG T CCAACiGAC PA f A f IT 5670 
CTAAAATGG ACCCAGCCTCTACCCTCG3AC TAAGCAC rGACi I'CCA rGCA rGGCTACAGCATCAGCC ACG? H /40 
OAAACGAGTGTTGGATGCAGACCCCCCCGAGATGCCrCCrrGCCGlCGAGGIG I C A A T A AC AT A TC AG TC HU10 
TC CCTCAAAGGTCT6AA6GAGAAA fGCGTCGACAGCCTGCTGTTCGAGACGCTGATCCCCAAGCCGA I'GA 5880 
"•5CAGCACTACA TAAGCC I CC I GC IGA-GC ACC6GCGCCTCGTCCTCTCGGGCCCC AGCGGCACGGGCAA 5950 
5960 5970 59no 5990 6000 60 :0 6020 
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GACCTACCTGACCAATCGCTTGGCCG^C'ACC rGG f G G AG C GC TC T GGCC G TG AG 3 TC AC A G AG GG C A TC £020 
f: r CAGCACC f TCAACA rGCACCAGCAGTCTTGCAAGCATCTGCAACTGTATC r r PCCAACC I AGCCAACC 6000 
■\ ft AT AG ACC G 3 G A A AC A GG A AT T G G GG A 1"G CCC C TCGTG AT 7CTATT3G A TG AC C TGAG rSAAGCASG 6)60 
r " CC ATC AG TG A G TTGG TC aYA T G GG GC CC f CACC"J*Gf AAGTA~CA T AAATGTCCCTA r f A r ACG r ACC 6230 
•*C ; ; AA FCAGCC fG I AAA A A TG AC AC CCA AC C A TGGCT rSCACi IGAGC !* T C AG GATG 7 To A T: C TTC ~ C C A 6300 
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AGCCAAGGATGGGATAAAGC] rcCA i 33ACA3AAAGCrGC7TGGGAGGACCCAG l*3G AATGGGTCCGGGAC 6660 
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GC ICCCATCAGCTA-CT rAGC T:*C TCC*CTCCCCTCTCCTCTTTCAGAGCAC I WC I'C rcCAfcCCCCAGC 7000 
7010 .'020 7030 .W 7050 /<)«) 7070 
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AI'CGGCCGC 7GTCATCAGATCGCC ATCTCGCGCCCGTGCCTC rSACTTC TAAG PCCAA f TAi" rCTTCAAC 7420 
ATCCCTACATGCTCT T fC rCCC i*G I GC rCCCACCCCCTATTTTTGTTATTATCAAAAAAAC TTCTTCTTA /4G0 
A I i ICri'ltl'! r rrrAGCTTCTTTrAAGTCACCTCTAACAATGAAATTGTGTAGATrCAAAAATACiAATT /MX) 
A A i* rCG I'AA rAAAAAGTCGAAAAAAATTGTSCTCCCTCCCCCCATTAATAATAATTCrATCCCAAAATCT /B:tO 
A;:ACAATGT P:TGT(iTACAC I' rC I' I A I C I' CTTTTTTACTTCTGATAAATTTTTTTTGAAAC ATCATAGAA 7700 
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AAAACCGCACACAAAATACCTTATCA T ATGT7 ACG TTTCAG FT TATGACCGCAA"' i fr TA T JTCTTCGCA 7770 
:jTCrGGGCCTCTCATGACGrCAAATCATGCTCATCGTGAAAAAGTTTTGGAGTATTTTTGGAA~TTTC 7&U} 
ii. rCAAGTGAAAGTTTATGAAATTAATTTTCCTGCTTTTGCTTTTTGGGGGTTTCCnCTATTGTTTGTCA /<! !<J 
AG AG r rrCGAGGACGOCG rrrrTC"TnCTAAAATCACAAGTArTGATGAGCACGAT3CAA3AAAGATCGG 7SP.0 
A AGG T TTGG G T 7TG AGGCTC A £~G'.IAAGG 73 AG TAG A AG I* rGA TAA f I" IS AAA I GC*&r, i aG GiCf P.0£0 
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^TGGGGTTTTTGCCTTAAATGACAGAATACATTCCCAATATACCAAACATAACTGTTrCl!TACTACiTr;GTi IMA) 

"! C G T ACG GG CC C T T TCG TC 7 CG C GC G r 7TC GG T G A TGACG G TG A A A AC CTC TG AC A 3 A TG C AG C TC C C G fi iUSO 

-CjACCX* TCACAGC FTG TC TG rAAGCGGAlGCCGGGAGf.AGACAAGCCCGTCAGGGCGCGTC AGCGGG » G V S260 

T^GCGGGTG TCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGA rTGTAC fGAGAG ftiCACCA fA fGCCG 8330 

^TGAAATACCGCACA3ATGCG~AAGGA3AAAATACCGCA fCAGGCGGCC r rAAGSSCC i'CG iGA I ACGC f1<iC0 
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~GC0CGGAACCnCTATTTG T TTA77T r 7CTAAATACATTC AAA "AT3 TATCCGC r C A I'G A3 AC A A S'AACC 0540 
Z TC A T A A ATGC T TC A A T A A~ AT 70 A A A A AG G A A G A G I' A fG AG T A T TC A AC A T 7 TCC S TG TC £ C CC 7 T A TT 8610 
I C 7 T TT ^T GCG GC AT T T TGCC T *C C Tu TT"T T'3C TCACC G AG AA ACGC TGG TGAA AG T A A A A G A~ G C TG UGIX) 
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TC G C C CC G A AG A A C G f : f ICCAA : '?:A I^AGCACTTTTAAAGTTCTGCTATGTGGCGUGG fA f 1A : CC'JG \ 0C2C 
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ag:;g tgacaccacga rccc: r ag c a i ~gg c a ac a acg r r gcg c a a ac t a t t a ac tg g cga a c tac r cac 9 • 70 
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:iC;:CT"CCGGCTCGCTGG :*r rATTGC""C/.TAAATC rGGAGCCGGTfiAGCGTGGCTC TCGCGG TA l-JA |' r G 03 !0 

A " T 3GGG r * AG A TGG rAAfiC C :~C7 3*3 TATCGTfiG f r A T T AC AC G AC G G GG AG r C AG G C AA C T A T IttflO 

."i 5 .i I T G A A C G AAA T A G AC AG A r C 3 C fGAHiT AGGT3CC TCAC I GAT7AAGCA"TGGTAACTG fCAG^CCAA 9450 

SUtfO 9C70 3CfO 9^€0 3500 SiilO -jb'JO 


3irrACrCATATATACT-TAGA "G A " T"" AAA A3 T TC A I =' T T TAATTTAAAAGGATC TAGC. "G AAGA TCC WX* 
l i i i | .ArAAT;TCATG^;caAM^rcCCTTAAC3TGAGr"TrC6"TCCAC*r.AGCnrCA3ACCCCG!AGA 3wS«J 
AAAoAfCAAAGjATCrTCTrGAGAlXCTTr-TTTC TGCGCCiTAATCTGCTGCTTGCAAACAAAAAAACCA ^360 
•*"-GC TACCACC3G TGG f TG 7 r TC C C C C. A TC A AG A GCT AC C A A 1 T C TT T T TC Z G A A CIO I'AA C I G GC ! ICA y?.K ; 

rv; -'j agcgcaga -accaaatactg re; i *c tag"g t,vs:c • ta3 r 'aggcca::cac r tcaagaac c re r oaio 
fig pCS501

AGCACCGCCTACAIACC fCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAG7G3CGA7AAGTCGTG T 9670 
CI I ACCGGG TTGGACTCAAGACGArAG r TACCGGATAAGGCGCAGCGGTCGGGCTG AACGGGGliG f TCG I 99*^ 
GCACACAGCCCAGCrfGGAGCCAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCATTGAGAAAG 100 iO 
CGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGCTATCCGG fAAGCGGCAGGGTCGoAACAGGAGAGCGC 1COMO 
ACGAGGGAGCTTCCAGGGGGAAACSCCTGGTATCTrTArAGrCCTGTCGGGTTTCGCCACCTCTCACTTG 10 UiO 

10160 10170 10180 m 10190 10200 10210 I0220 

AGCGTCGAT fTTTG i'GA rGCTCGTCA3GGGGGCGGAGCCTArGGAAAAACGC(!A(5C AACGCKUICC I I 1 it UWXO 
ACGG f FCC I'GGCC TTTTCCTGCCCTTTTGC TC ACA TG f TC IT FCC TGCGTTATCCC CTGATTC 7G7GGAT 10290 
AACCGTATfACCGCCri I oAG I GAGCrGATACCGCTCGCCGCAGCCGAACGACCGAqCGCAGCGAGTCAG lUJbC 
TGAGCGAGGAAGCGGAAGASCSCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATG 10*30 
CAGC i'GGCACGACAGGTTTCCCGACTjGAAAGCGGGCAGrCAGCGCAACGCAATTAATGTGAGrTAGC IX 1C6C0 

10610 10620 10530. 10540 10550 1C560 106/0 


ACTCATTAGGCACCCCAGGC ITTACA'J I ITA rGCTTCCGGCTCGTATGTTGTGTGG AA T TG I'GAGCGGA f \OK /t) 
AACAA r f TCACAC AGG AAACAGC 1 1C5S3^ 
fig 13 pCB201 (1 > 5082) Site and Sequence

Enzymes: 100 of 146 enzymes (Filtered)

Settings: Linear, Certain Sites Only. Standard Genetic Code 

GACGGATCGGGAGATCTCCCCATCCCCTATGGTCGACTCTCAGTACAATCrGCTCTGATGCCGCArAGrTAAGCCAG7ArcrGCTCCCTGCTTGTGT'jT- 

CTGCCTAGCCCTCTAGAGGGCTAGGGGATACCAGCTGAGAGTCATGTTAGACGAGACTACGGCGTATCAArTCGGTCATAGACGAGGGACGAACACACAA ^ 

TORE ISRSPMVDSQYNLL .CR I 7 K P V S A P C L C V 



GGAGGTCGCTGAGTAG TGCGCGAGC AAAATTTAAGCTAC AAC AA GGCAAGGCTTGACCGACAATTGC ATGAAG AATCTGC TTAGGGTTAGGCGTTTTGC3 

CC TCC AGCGAC TCA TCACGCGCTCGTTTTAAATTCGATGfTGTTCCGTTCCGA AC TGGCTGTTA AC GT AC TTCTTAGACGAATCCCAATCCGCAAA AC GC 
GGR VVREONL SYNKARLDROLHE £ 5 A G AF C 



CTGCTTCGCGATGTACGGGCCAGATATACGCGT7GACATTGATTATTGACTAGTTATTAATAGTAATCAA TTACGGGGTCATTAGTTCATAGCCCATATA 
GACGAAGCGCTACATGCCCGGTCTATATGCGCAACTGTAACTAATAACTGATCAATAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTA~A~ 
A A S R C T G Q I Y A L T L II 0 , L I I V INYGVISS.PIY 

TGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTC AATAATGACGTATGTTCCCATAGT 
ACCTCAAGGCGCAATGTATTGAATGCCATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCA 
GVPRYITYGKVPAVL TAQRPPP IDVNNDVCSH5 



AACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCC: 



TTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTGATAAATGCCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGrTCATGCGG2 
N A N R O F P L T S M G G LF TVNCPLG5 T S S V S YAK YA 

CCTATTGACGTCAATGACGG TAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCC TACTTGGCAGTACATCTACGTATTAGTCA 

GGATAACTGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCATGrACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAG" 
P Y ♦ R Q • R . M A 3 ALCP VHDLMGL5 YLAVHLR I S H 

rCGCTAT TACCATGGTGATGCGGyTTiGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACrCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAM 


AGCGATAATGGTACCACTACGCCAAAACCGTCATGrAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTrCAGAGGrGGGGTAACTGCAGT- 

RYY HGD AVLAVHQ VAVIAV.LTGISKSPPH.RQ 

TG3GAGTTTGTTTTGGC ACCAAAATCAACGGGACT7TCCAAAATGTCG T AACAACTCCG3CCCATrGACGCAAATGGGC3GTAGGCGTGTACGG TGGGAl: 
ACL*CTCAAACAAAACCGTGGTTT7A3"TGCCCT3AAAGGrTT rACAGCATTGTTGAGGCGGGGTAACTGCjTTTACCCGCCATCCG TU 
U 'SF VLAPK 3T GLS K M S . 0 L R P IOANGR . A C T V G 

G7C TATA TAAGC AGAGC TC TC TGGC TaACTAGAGaaCCC ACT GCTTAC TGGCT TATCGAAAT TAAfACGAC TCAC TA7AGGGAGACCCAAGC TGGC TAG: 
CAGATATA7TCG7CTC3AGAGACC3A' TGATC TC 7 *3GG TGACGAATGACCGAA TAGC TT 7AATTA TGC 73AG TGATATCCC TC TGGGTTCGACCGATC^ 

T7 promoter priming site

0 '- Y * Q S S A H 3 7 H CLLAYRM.YQSL.GDPSVLj* 

G r 7 7AAACTTAAGCrTACCATGGG3 3GrTCTCA~CArCATCATCATCATGGTATGGCTA3CATGACTGGT33ACAGCAAATGGGTCGGGATCTGTACG.ft^ 
C AAAT TT3AA T "CGAATGG TACCCCCCAAGAG TA£ 'A3 TAG TAG TAGTACC ATACCGA TCG TAC TGACC ACCTGTCGTTTaCCCAGCCCTAGACATGC T3 

ProBond binding domain

FKLKLTMGGSHHHHHHGMASMTGGQQMGRDLYD



GArQACGArAACGTACCTAGGATCCATATGCC TCC TTGCCGTCGAGGTGTCAA TAACATATC AGTC TCCCrCAAAGGrCTGAAGGAGAAArGCGTCGAC* 
C^ACrGCTATTCCATGGATC CTAGGTATACGGAGGAACGGCAGCTCCACA3TTATrGT ATAGrCAGAGGGAGTrTCCAGACrTCCTCTr7A^GCAG"'G- ' 




ODKVPRIHMPPCRRGVNN 



U4QRF 

SVSLKGLKEKCyo 



GCCTGGTGTTCGAGACGCTGATCCCCAAGCCGATGATGCAGCACTACATAAGCCTCCTGC TGAAGCACCGGC GCCTCGTCCTC TCGGGCCCCAGCCGCA^ 
CGGACCACAAGCTCTGCGACTAGGGGTTCGGCTACTACGTCGTGATGTATTCGGAGGACGACTTCGTGGCCGCG GAGCAGGAGAGCCCGGGGTCGCCGT^ 

- pCB201 insert a U4 




-U4 ORF 



5 I v F E * I 1 P K P * M 0 H V ! S L I L K HRRUVLSGPSGT 

GGGCAAGACCTACCTGACC^TCGCTTGGCCGAGTACCTGGTGGAGCGCTCTGGCCGTGAGG TCACAGAGGGCATCGTCAGCACCTTCAAC ATGCACCAti 
CCCGTTCTG6ATGGACTGGTTAGCGAACCGGCTCATGGACCACCTCGCGAGACCGGCACTCCAGTGTCTCCCGTAGC AGTCGTGGAAGTTGTACGTGGT: 

^8201 insert = U4 



-U4 0RF 



G * T y L T NRLAEYLVERSGREVTEG I V S T F N M H •*■ 



CACirCTTGCAAGGATCTGCAACTGTATCTTTCCAACCTAGCC AACCAGATAoACCGGGA AAC AGGAATTGGGGATGTGCCCC TGGTGATTC TATTG5AT3 
G-CAGAACGTTCCTAGACGTTGACATAGAAAGGTTGGArCGGTTGGTCTATCrGGCCCTrTGTCCTTAACCCCTACACGGGGACCACTAAGATAACCT^: 




~U4 CRF 



3C K0 LQLYL SNLANOIORETGIGOVPLViLLO 



T3AG TCAAGCACGC TCCATC AGTGAGTTGGTC AA7GGGGCCC TCACC TGCAAGTATCATA AATGTCCC TATATTATAGG TACCACC AArCAGCCXi" 
'ZZAZ rc AC TTCGfCCGAGGTAG TC ACTCAACCAG 7 TACCCCGGGAGTGGACGTTCATAG TA TT TaCAGGG ATATAATATCC ATGGTGG AGTCGUAi- 




-U4 CRF 



0 . S E A 



S I S £ L 



6 A I T C K Y H K C P Y i [ G T T 0 P 



TGACACCCAACC ATGGC TTGCAC TTGAGZ r TC AG5ATGT TG ACCT TC TCC AAC AACGTGGAGCC AGCC AA T GGCTTCC TGGTTCGfTACCTSAGu 
" 7TAC TGTGGGT TGC TACCGAACGTGAAC7CGAAG "CCTACAACTGGAA5AGGTTG 7TGCACCTCGG TCGG fTACCGAAGGACCAAGCAATGGACITC^ 

-pC820l insert = U4 




— U4GRF — — 

r M 7PNHGLHL 3 F P M L r F S N fl V E => A M G ? L 



Y L P 
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tig 13 pCB20J (1 >5082) Site and Sequence , _ 

AGGAAGCTGGTAGAGTCAGACAGCGACATCAATGCCAACAAGGAAGAGCTGCTTCGGG7GCTCGAC TGGGTACCCAAGCTGT3G TATCA7C TCCAC ACC " 
TCCTTCGACCATCTCAGTCTGTCGCTGTAGTTACGGTTGTTCCTTCTCGACGAAGCCCACGAGCTGACCCATGGGTTCGACACCArAGTAGAGGTGTGGA *' 



-PCB201 insert =» U4 



-U4 0RF 



RKUVESOSOrNANKEELLRVLOVvPKLVYHLHT 

TCCTTGAGAAGCACAGCACCTCAGACTTC CTCATCG GCCCT TGCTTCTTTCTG TCGTGTCCC ATTGGCATTGAGGACTTCCGGACCTGGTTCAT7GACC-T 
AGGAACTCTTCGTGTCGTGGAGTCTGAAGGAGTAGCCGGGAACGAAGAAAGACAGCACAGGGTAACCGTAACTCCTGAAGGCCTGGACCAAGTAACTGGA 



- pCB201 insert = U4 



-U4 OflF 



F L E K H S T S 0 F L I G P C F F L S C P I G 1 £ 0 F R T V F I D L 
GTGGAACAACTCTATCATTCCCTATCTACAGGAAGGAGCCAAGGATGGGATAAAGGTCCATGGACAGAAAGCTGCTTGGGAGGACCCAGTGGAATG3GT: 



CACCTTGTTGAGATAGTAAGGGATAGATGTCCTTCCrCGGTTCCTACCCTATTTCCAGGTACCTGTCTTTCGACGAACCCTCCTGGGTCACC 7TACCCA3 



•pCB201 insert = U4 



-U4 0RF 



v N M S I I PY LQEGAKDGIKVHGQKAAVEQPVEV V 

CGoGACACACTTCCCT3GCCATCA3CCCAACAAGACCAATCAAAGCTGTACCACCTGCCCCCACCCACCG rGGGCCCTCACAGCArTGCCTCAC:T:CC:; 
GCCCTGTGTGAAGGGACCGGTAGTCGGGTTGTTCTGG TTAGTTTCGACATGGTGGACGGGGGTGGGTGGCACCCGGGAGTGTCGTAACGGAGTGGA3G»»w 



-pCB201 insert =U4 



-U4 0RF 



^ 0 TLPVP3AQQ0QSKLYHLPPPTVGPHS lASPP 
AG3AT AGGACAG 7C AAA3ACAGC ACCCC AAG7 7C 7CTGG AC TCAGATCCTC TGA7GGCCATGCTGC TGAhACT TCAAGAAGC TGCCAAC 7 AC A T "GAG f'J 



TQC TATCw "G7C AGTTTCTG7CG73GGG77CAA3AGACC 7GAG7C7AGCAGAC TACCGGTACCACGAC 77TGAAC 77C 77CGACG3T 7GA73 7A AC T 



-pCB201 insen = U4 



- U4 ORF 



^ORTvk'OSrPSSLOSOPLMAMLLKLQEAANY IE 



TCC AGAfCGAGAAACC A7CC7GGAC CCC AACC 77C AGGC AAC AC T7 7AAGGG77CGGC AA IXACTG 7CACCCCCGGACAGCAGAACGC~GGCaT:A3C 7A 
AG3TCTA3CTC777GGTAGGACC733GGT7GGAAG7:CGrTGTGAAAr7CCCAAGCCG7TAG7GACAGTGGGGGCCTGTCGTC7TGCG.HCCG y A37:G^ : 



-pCB20l insert = U4 



— U4CFF 

p 0 B ETILC'HLCATL GFGNHCHPR T A E R 'V -i <j L 
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fiq13pC820l (1>S082) Site and Sequence 

TCTTAGC TCCTCCTCTCCCCTCTCCTCTTTCAGAGCACTGGCTCTCCAGCCCCAGGAGGAGAACAGGAGGG AGGAGGAGATGAAAGAGGA GGGACAGGT r 
AGAAfCGAGGAGGAGAGGGGAGAGGAGAAAGTCTCGTGACCGAGAGGTCGGGGTCCTCCTCTTGTCCTCCCTCCTCCTCTACTTTCTCCTCCCTGTCCAA 



-pCB201 insert 3 U4 



S -L LlSPL LF QS T GSPAPGGEQEGGGOERGGTG 

C TTGGTGCrGTACCTTTGAGAACTTCCTAGGAAGGAATGGTGGGGTGGCGTTTGGGAACTTG TG CCCCCTAAACACATTTAC TGGCCTCCTC TAATGAC 7 
GAACCACGACATGGAAACTCTTGAAGGATCCTTCCTTACCACCCCACCGCAAACCCTTGAACACGGGGGATTTGTGTAAA TG ACC GG AGG AG A T TAC TG A 



- DCB201 Insert « U4 



SVCCfFENFLGRNGGVAFGNLCPLNTFTGLL 



TTGGGGAAAAGATGATTCTGGGTCTTTCCC TTGAC TTC T TGTTTC A AT TAC AAACTCCTGGGCTTTCTGGGGAGGGGTT C AG AAA AC A TC A AAAC AC TG> 
AACCCCTTTTCTACTAAGACCCAGAAAGGGAACTGAAGAACAAAGTTAATGTTTGAGGACCCGAAAGACCCCTCCCCAAGTCTTTTGTAGTTTTGTGAC3 



-pCB201 insert = U4 



v G K 0 0 S G S F P . L L V S 1 T H SVAFVGGVQK TSK HC 

AGCAGTTCCTAAATGATTCTCACAAGCAACCCTGAGAGAGACAGTCTTGTGAGGGAGATCTGGGGGAG GCAGGAAGCTCCTCAGATTTTCTCACAGACCC 
TCGTCAAGGATTTACTAAGAGTGTTCGTTGGGACTCTCTCTGTCAGAACACTCCCTCTAGACCCCCTCCGTCCTTCGAGGAGTCTAAAAGAGTGTCTGG^ 



-pCB201 insert ■ U4 



SSS.M1LT5NPERDS LVREIVGRQEAPQ IFSQT 

T TC CC AATTCCATC ACC AC TGCCAAC AC TCGTCCGGAAT TCTGCAGATATCCAGCACA GTGGCGGCCGCTCGAGTC TAGAGGGCCCGTTTAAACCCGCTG 
AAGGGTTAAGGTAGTGGTGACGGTTGTGAGCAG3CCTTAAGACGTCTATAG3TCGTGTCACCGCCGGCGAGCTCAGArCTCCCGGGCAAATTTGGGCGA- " " 

pCB201 insert a U4 ' 

I P N 3 I T T ANTR^EFCR YPAQVRPtESRGPV TR 

ATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTrGTTTGCCCCTCCCCCG TGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTKC 
TAGTCGGAGCTGACACGGAAGATCAACGGTCGGTAGACAACAAACGGGGAGGGGGCACGGAAGGAACTGGGACCrTCCACGGTGAGGGTGACAGGAAAGG 
3AS TVP S3CQPSV V CPSPVPSLTLEGATPTVL S 

TAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCA AGGGGGAGGATTGGGAAGACH 
A TTAT TT TACTCCTTTAACGTAGCGTAACAGAC ~C ATCC ACAGTAAGATAA3ACCCCCCACCCCACCCCG TCC TGTCGTTCCCCC TCCTAACCC TTCTG* 
■ N , EE ^ASHC LS^ CH S ILGG GV GOOSKGEQVEO 

ATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGA ACCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCT5TAGCGGCG: 
rATCGTCCGTACGACCCCTACGCCACCCGAGATACCGAAGACTCCGCCTTTCTTGGTCGACCCCGAGATCCCCCATAGGGGTGCGCGGGACATCGCCGC^ X ~' 

NS RHA QD AVG SMASE aertsvgsrgyphapcsga 

ATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCC CrAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGC^ 
TAATTCGCGCCGCCCACACCACCAATGCGCGTCGCACTGGCGATGTGAACGGTCGCGGGArCGCGGGCGAGGAAAGCGAAAGAAGGGAAGGAAAGAGCGJ " ' * 
ISA AGV VV TRSVT ATLASAtAPAPFAFFPSFLA 

AQG ttcgccccc tttccccgtcaagctctaaatcggggc atccctttagggftccgatttag TGC T T TaCGGC ACC TCG ACCCC AAAAAAC 7TGAT TAGU 

TGC AAGCGGCCGAAAGGGGCAGTTCGAGATTTAGCCCCGTAGGGAAATCCCAAGGCTAAArCACGAAATGCCGrGGAGCTGGGGTTTTTTGAACTAATC: 
* p AGFPRQALNPG I PLGFRFSALRHLDPK K L 0 
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GTGATuGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGi 

, i » 1 1 ' ' — i 1 ' ' ■ ■ 1- x- :- 

CACTACCAAGTGCATCACCCGGTAGCGGGACTATCTGCCAAAAAGCGGGAAACTGCAACCTCAGGTGCAAGAAATTATCACCTGAGAACAAGGrTTGAC. 
GOGSRSGPSP . . TVFRP L T L £ S T F F N S G L t F Q T G 

A ACAACACTCAACCCTATCrCGGTCTATTCTTTTGATTTATAAGGGATTTrGGGGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTA ACAAAAATTT 
rTGTTGTSAGTrGGGATAGAGCCAGATAAGAAAACTAAATATTCCCTAAAACCCCTAAAGCCGGATAACCAATTTTTTACTCGACTAAATTGTTTTTAAA 
TTLNP I SVYSFOL . G !LG I SAYVLKNEL I Q K F 

AACGCGAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAA GTCCCCAGGCTCCCCAGGCAGGCAGAAGTATGCAAAG CATGCATC TCAATTAGT 
TT3CGCTTAATTAAGACACCTTACACACAGTCAATCCCACACCTTTCAGGGGTCCGAGGGGTCCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAATCA 
NAN.FCGMCVS GVESPOAPQAGRSMQ SMHUN 

CAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCC 

1 1 i i ii » 1 • 1 1 ■ » ' i >3 o: 

GTCGTTG3TCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTTCGTACG TAGAGTTAATCAGTCGTTGG TATCAGGGCGGGGATTGAG3 
SATRCGKSPGSPAGRSMQSMHLN .SAT IVPPLTP 

GCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCTGCCTCT 

' ' ' ■ « 1 ... i . J- 

CGGGTAGGGCGGGGATTGAGGCGGGTCAAGGCGGGTAAGAGGCGGGGTACCGACTGATTAAAAAAAATAAATACGTCTCCGGCTCCGGCGGAGACGGAGh 

PIPPLTPPSSAHSPPHG.LIFFIYAEAEAASA5 
GAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTCCCGGGAGCTTGTATATCCATTrTCGGATCTGATCAAGAGA 

, i 1 . i > 1 i ' 1 ■ i iao: 

CTCGATAAGGTCTTCATCACTCCTCCGAAAAAACCTCCGGATCCGAAAACGTTTTTCGAGGGCCCTCGAACATATAGGTAAAAGCCTAGACTAGTTCTC* 
ELFQK. GGFFGGLGFCKKLPGAC IS 1 FGSOQE 

CA3GATG AGGATC3TT TCGCATGATTGAAC AAGA~GGATTGC ACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGAC TGGGCAC AACA 

. 1 1 ' — h iiii 1 . 1 1- y$x 

G'CCTAC rCCTAGCAAAGCGTACTAACTTGTTCTACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTG" 

T 3 .GSFPM IEQOGLHAGSPAAVVERLFGYOVAQO 

G.i:AArC3GCTGCTCT3ATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACT3 

— i 1 » t i ■ ■ < > < i > i ■ i « ■ > w.;c»: 

C"3rrAG:CGACGA3ACTACGGCGGCACAAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAACAGTTCrGGCTGGACAGGCCACGGGACTTACTTGA^ 

7 EGC5DAAVFRLSAQGRPVLFVK TDUSGALNEL 
C A 3 G ACG -G GC AGC GC G GC TA TCGTGGC TGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGC TGCTA' 

. 1 , i 1 , 1 . 1 . 1 1 ► 

G r CCrGC rcCGTCGC3CCGATAGCACCGACCGG"GCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAACAGTGACTTCGCCCTTCCCTGACCGACGATA 
■-CEAARLSVLATTGVPCAAVLOVVTEAGRDVLL 

"'j3GC3AAGTGCC3353CAGGATCTCCTGTCATC "CACC TTGCTCCTGCCGAGAAAGTaTCC ATCATGGCTGATGCAATGCGGCGGCTGCATaCGC TTG^ 
ACCCGCTTCACGGCCCC3TCCTAGAGGACAGTA3AGTGGAACGAGGACGGCrCTTTCATAGGTAGTACCGACTACGTTACGCCGCCGACGrATGCGAAC* 
l 5 E V P 3 00LLSSHLAPAEKVS IMADAMRRLH T L 0 

TCCGGCTACCTGCCCATTC GACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGArGGAAGCCGGTCTTGTCGATCAGGATGArCTGGACGAA 
AG3CCSA TGGACG33T AAGCTGG TGGTTCGCT T?G TAGCGTAGC TCGC TCGTGCATGAGCCTACCT TCGGCCAGAACAGC TAGTCCTAC TAG ACCTGC T* 
P A r C P FDHOAKHR [ E R A R TRMEAGCV00D3L0E 

G£3C ATC AGGGGCTC3 CGCCAGCCGAAC TGTTC3CCAGGCTC AAGGCGCGC ATGCCCGACGGCGAGGATC TCGTCGTGACCC ATGGCGAT3CC TGC TTG- 
C TAG KCCCGA3C3CGGTCGGC rTGACAAGCGGrCCGAGTTCCGCGCGrACGGGC TGCCGC TCCTAGAGCAGCAC TGGG rACCGC TACGGACGA£C;; 
• H 0 G L APAELFARLKARNPOGEO-LVVTHGOACL 
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CGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGC TA TC AGGACATACCGTTGGCTACCCGTGA 

GCTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGCTGACACCGGCCGACCCACACCGCCTGGCGATAGTCCTGTATCGCAACCGATGGGCACy *" 
PN IMVEN GRF S G F IDCGRLGVAORYQO lALATRO 



TAtTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCG CATCGCCTTCTArCGCCTTCTT 
TTACCCGACT6GCGAAGGAGCACGAAATGCCATAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAA 



ATAACGACTTCTCGAACCGCCGC 



j A E E L G G E tf A D R F L V L Y G I A A P 0 S Q R I A F Y R L L 

GACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACG AGA TT TCG AT TC C ACCGCC GC C TTCT ATG A 
CTGCTCAAGAAGACTC6CCCTGAGACCCCAAGCTTTACTGGCTGGTTCGCTGCGGGTTGGACGGTAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACT 
OEF F.AGLVG SK. P TKRRPTCHHEISiPPpp$M 

AAGGTTGGGCTTCGGAATC6TTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGA GTTCTTCGCCCACCCCAACTTGTTTATT 
TTCCAACCCGAAGCCTTAGCAAAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGGTTGAACAAATAA ^ 
* G V A S E S F SG TPAG . S S SAG I S C V 5 SSPTPTCLL 

GCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAG CATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCA 

cgtcgaatattaccaatgtttatttcgttatcgtagtgtttaaagtgtttatttcgtaaaaaaagtgacgtaagatcaacaccaaacaggtttgagtagt ^ CC 

Q > 1 H V T N K A I A S Q 1 S 0 IKHFFHC ILVVVCPNS3 

atgtatcttatcatgtctgtataccgtcgacctctagctagagcttggcgtaatcatggtcatagctgtttcctgtgtgaaat tgttat^^ 
tacatagaatagtacagacatatggcagctggagatcgatctcgaaccgcattagtaccagtatcgacaaaggacacactttaacaataggcgagtgtta 50C> " 

HYL IMSVY RR PLA RAVRNHGHSCFLCE IV IRSQ 

tccacacaacatacgagccggaagcataaagtgtaaagcctggggtgcctaatg agtgagctaactcacattaattgcgttg 
aggtgtgttgtatgctcggccttcgtatttcacatttcggaccccacggattactcactcgattgagtgtaattaacgcaac 5082 
Fhtt *epea svkpgvpne ansh lrw 
Claims

1. A vertebrate protein homologue of an UNC-53
protein of C. elegans or a functional equivalent,
derivative or bioprecursor thereof, which protein
comprises an amino acid sequence having a
statistically significant homology to the amino acid
sequence of said UNC-53 protein of C. elegans
illustrated in Figure 2.

2. A vertebrate protein homologue of an UNC-53
protein of C. elegans, which protein comprises an
amino acid sequence having one or more of sequence
blocks A, B, C, D or E as illustrated in Figure 9a, or
block F in Figure 12a or a sequence having a
statistically significant homology therewith.

3. A vertebrate protein homologue of an UNC-53
protein of C. elegans, which protein comprises an
amino acid sequence having one or more of sequence
blocks A, B, C, D, E or F which differ from those
blocks of Figure 9a or 12a only in conservative amino
acid changes.

4. A vertebrate protein having an amino acid
sequence encoded by the nucleotide sequence shown from
nucleotide positions 1 to 6013 illustrated in Sequence
ID No. 3.

5. A vertebrate protein comprising an amino acid
sequence which comprises one or more of the prosite
signatures as illustrated in Figure 28 for each of
said sequences of homology as claimed in claim 2.

6. A vertebrate protein comprising an amino acid
sequence as claimed in any one of claims 1 to 6 which
is a human protein or a mouse protein.

7. A vertebrate protein having an amino acid 
sequence encoded by the nucleotide sequence shown in 
Sequence ID No. 4. 

8. A vertebrate protein homologue according to 
any one of claims 1 to 7 comprising an amino acid 
sequence as shown in Sequence ID No. 1 or an amino 
acid sequence which differs from the amino acid 
sequence shown in Sequence ID No. 1 in one or more 
conservative amino acid changes.

9. A vertebrate protein homologue according to 
any one of claims 1 to 7 comprising an amino acid
sequence as shown in Sequence ID No. 2 or an amino 
acid sequence which differs from the amino acid 
sequence shown in Sequence ID No. 2 in one or more 
conservative amino acid changes. 

10. A cDNA encoding a vertebrate homologue of
UNC-53 protein of C. elegans according to any of
claims 1 to 9.

11. A cDNA according to claim 10 comprising a 
sequence of nucleotides encoding an amino acid 
sequence as shown in Sequence ID No. 1 or an amino 
acid sequence which differs from the amino acid 
sequence shown in Sequence ID No. 1 only in one or 
more conservative amino acid changes. 

12. A cDNA according to claim 10 comprising a
sequence of nucleotides encoding an amino acid
sequence as shown in Sequence ID No. 2 or an amino
acid sequence which differs from the amino acid
sequence shown in Sequence ID No. 2 only in one or
more conservative amino acid changes.

13. A cDNA according to any of claims 10 or 11
which cDNA comprises at least from nucleotide position
1 to position 6013 of the sequence as shown in
Sequence ID No. 3.

14. A cDNA according to claim 10 or 12 which
comprises the nucleotide sequence illustrated in
Sequence ID No. 4.

15. A nucleic acid molecule capable of
hybridising to the DNA sequences according to any of
claims 10 to 14 under high stringency conditions.

16. A DNA expression vector which comprises a
cDNA as claimed in any of claims 10 to 14.

17. A vector according to claim 16 which
comprises a promoter of C. elegans UNC-53 protein or a
vertebrate homologue thereof according to any of
claims 1 to 9.

18. A vector according to claim 17 wherein said
promoter sequence is derived from a gene encoding a
mouse or human homologue of an UNC-53 protein of
C. elegans.

19. A vector according to any of claims 16 to 18 
which further comprises a sequence encoding a reporter 
molecule. 

20. A vector according to claim 19 wherein said
reporter molecule is a fluorophore.

21. A host cell transformed or transfected with 
the vector of any of claims 16 to 20. 

22. A host cell transformed or transfected with 
the vector of claims 19 or 20. 



23. A host cell according to claims 21 or 22,
which cell comprises a prokaryotic cell such as a
bacterial cell or a eukaryotic cell such as a fungal,
an animal, a plant or an insect cell.



24. A transgenic cell, tissue or organism
comprising a transgene capable of expressing a protein
according to any of claims 1 to 9.

25. A transgenic cell, tissue or organism
according to claim 24 which comprises any of a COS
cell, Hep G2, MCF-7 cell, N4 mouse neuroblastoma cell,
a NIH3T3 cell, or colorectal carcinoma or human
derived cells.



26. A transgenic cell, tissue or organism
according to claim 24 or 25 wherein said transgene
comprises a vector according to any of claims 16 to
20.



27. A transgenic cell, tissue or organism
according to any of claims 24 to 26 wherein said transgene
comprises a vector according to claim 19 or 20.

28. A transgenic cell, tissue or organism
according to any of claims 24 to 26 wherein said
organism comprises any of an insect, a fungus, a
non-human mammal, a plant or a nematode worm.

29. A method of producing a mutant vertebrate 
non-human organism which mutation affects cell 
behaviour or the regulation of cell motility or the 
shape or the direction of cell migration, which method 
comprises inducing a mutation in the wild type gene 
encoding the vertebrate homologue of an UNC-53
C. elegans protein.

30. A vertebrate protein homologue of an UNC-53 
protein of C. elegans, or a functional equivalent,
derivative, fragment or bioprecursor thereof, for use 
as a medicament to promote neuronal regeneration, 
revascularisation, wound healing or for treatment of 
chronic neuro-degenerative diseases or acute traumatic 
injuries or fibrotic disease. 

31. A vertebrate protein homologue of an UNC-53 
protein of C. elegans for use as claimed in claim 30
wherein said vertebrate human homologue is as claimed 
in any one of claims 1 to 9. 

32. Use of a vertebrate protein homologue of an 
UNC-53 protein of C. elegans, or a functional
equivalent, derivative, fragment or bioprecursor
thereof, in the manufacture of a medicament for
promoting neuronal regeneration, revascularisation,
wound healing or for treatment of chronic 
neurodegenerative diseases or acute traumatic injuries 
or fibrotic disease. 

33. Use of a vertebrate protein homologue of 
UNC-53 protein of C. elegans according to claim 32 
wherein said vertebrate protein homologue is as
claimed in any one of claims 1 to 9.

34. A pharmaceutical composition comprising a 
vertebrate homologue of an UNC-53 protein of
C. elegans, or a functional equivalent, derivative,
fragment or bioprecursor of said vertebrate protein,
together with a pharmaceutically acceptable carrier,
diluent or excipient therefor. 

35. A pharmaceutical composition as claimed in 
claim 34 which comprises a vertebrate homologue of an 
UNC-53 protein of C. elegans according to any of 
claims 1 to 9.

36. A nucleic acid sequence encoding a 
vertebrate homologue of an UNC-53 protein of
C. elegans or a functional equivalent, fragment,
derivative or bioprecursor of said vertebrate 
homologue, for use as a medicament. 

37. A nucleic acid sequence according to claim 
36 wherein said sequence is a cDNA sequence as claimed 
in any of claims 10 to 14 or a functional fragment of 
said cDNA sequence. 

38. Use of a nucleic acid sequence encoding a 
vertebrate homologue of an UNC-53 protein of
C. elegans or a functional equivalent, fragment,
derivative or bioprecursor of said vertebrate 
homologue, in the manufacture of a medicament to 
promote neuronal regeneration, revascularisation or 
wound healing, or for treatment of chronic 
neurodegenerative diseases or acute traumatic injuries 
or fibrotic disease.

39. Use of a nucleic acid sequence according to
claim 38 wherein said sequence is a cDNA sequence as
claimed in any of claims 10 to 14 or a functional 
fragment of said nucleic acid sequence. 

40. A pharmaceutical composition comprising a 
nucleic acid sequence according to claim 36 or 37 and 
a pharmaceutically acceptable carrier, diluent or
excipient therefor. 

41. A pharmaceutical composition according to 
claim 40 wherein said nucleic acid sequence is a cDNA 
sequence as claimed in any of claims 10 to 14. 

42. A method of determining whether a compound 
is an inhibitor or enhancer of the regulation of cell 
behaviour, growth, cell shape or motility or the 
direction of cell migration, which method comprises 
contacting said compound with a host cell according to 
claim 21 or 23 or a transgenic cell as claimed in any 
of claims 24 to 27 and screening for a phenotypic 
change in said cell. 

43. A method according to claim 41 which is 
capable of determining whether said compound is an 
inhibitor or an enhancer of the signal transduction 
pathway of said transgenic cell of which said 
vertebrate homologue of an UNC-53 protein or a 
functional equivalent, derivative, fragment or 
bioprecursor of said vertebrate homologue is a 
component or is an inhibitor or an enhancer of a 
parallel or redundant signal transduction pathway in 
said cell. 

44. A method according to claim 43 wherein said
method is capable of determining whether said compound
is an inhibitor or an enhancer of said vertebrate
homologue of an UNC-53 protein of C. elegans or a
functional equivalent, fragment, derivative or 
bioprecursor of said vertebrate homologue. 

45. A method according to any of claims 42 to 44 
wherein said phenotypic change to be screened is a 
change in cell growth, or shape or a change in cell 
motility. 

46. A method according to any of claims 42 to 44 
wherein said phenotypic change to be screened is a 
change in filopodia outgrowth, ruffling behaviour, 
cell adhesion, contact inhibition or the length of 
neurite growth. 

47. A method as claimed in any of claims 42 to 
44 wherein said transgenic cell is an N4 neuroblastoma 
cell and the phenotypic change to be screened is the
length of neurite growth. 

48. A method as claimed in any of claims 42 to 
44 wherein said transgenic cell is an MCF-7 breast 
carcinoma cell or an NIH3T3 cell and the phenotypic 
change to be screened is the extent of phagokinesis or 
contact inhibition. 

49. A method of determining whether a compound 
is an inhibitor or an enhancer of the regulation of 
cell shape, cell growth or motility or of the 
direction of cell migration, which method comprises 
administering said compound to a transgenic organism 
according to any of claims 24 to 28 or a mutant 
organism produced according to the method of claim 29
and screening for a phenotypic change in said
organism. 

50. A method according to claim 49, wherein said
method is capable of determining whether said compound
is an inhibitor or enhancer of a protein of the signal
transduction pathway of said transgenic or mutant
organisms, of which the vertebrate homologue of UNC-53
protein of C. elegans or a functional equivalent,
derivative, fragment or bioprecursor of said
vertebrate homologue is a component, or is an 
inhibitor or an enhancer of a parallel or redundant 
signal transduction pathway in said cell. 

51. A method according to claim 50 wherein said
method is capable of determining whether said compound
is an inhibitor or an enhancer of the vertebrate
homologue of UNC-53 protein itself or a functional
equivalent, fragment, derivative or bioprecursor of
said vertebrate homologue.

52. A compound which is identifiable by the
method according to any one of the claims 42 to 51 as
an enhancer of the regulation of cell shape, or growth
or motility or the direction of cell migration for use
as a medicament for promoting neuronal regeneration,
revascularisation or wound healing or for treatment of
chronic neurodegenerative diseases or acute traumatic
injuries or fibrotic disease.

53. Use of a compound which is identifiable by
the method according to any one of the claims 42 to 51
as an enhancer of the regulation of cell shape, or
growth or motility or the direction of cell migration
in the preparation of a medicament for promoting
neuronal regeneration, revascularisation or wound
healing or for treatment of chronic neurodegenerative
diseases or acute traumatic injuries or fibrotic
disease.

54. A pharmaceutical composition comprising a 
compound identified according to the method of any of 
claims 42 to 51 and a pharmaceutically
acceptable carrier, diluent or excipient therefor. 

55. A compound which is identifiable by the 
method according to any one of claims 42 to 51 as an
inhibitor of the regulation of cell motility, growth, 
or shape, or the direction of cell migration, for use 
as a medicament for alleviating the spread of disease 
inducing cells or metastasis or loss of contact 
inhibition. 

56. Use of a compound according to claim 55 in 
the manufacture of a medicament for alleviating the 
spread of disease inducing cells or metastasis or loss 
of contact inhibition. 

57. A pharmaceutical composition comprising the 
compound as claimed in claim 55, and a 
pharmaceutically acceptable carrier, diluent or
excipient therefor. 

58. A method of determining whether a compound 
is an inhibitor or an enhancer of transcription of a 
gene encoding a vertebrate homologue of UNC-53 protein 
of C. elegans, which method comprises the steps of (a)
contacting said compound with a cell according to any 
of claims 22 or 27 and (b) monitoring the level of 
said reporter molecule and comparing the results
obtained from said monitoring step with a control
comprising a cell according to claims 22 or 27, which 
cell has not been contacted with said compound. 

59. A method as claimed in claim 58 wherein said
reporter molecule detected is mRNA or green
fluorescent protein.

60. A compound which is identifiable by the 
method according to claims 58 or 59, as an enhancer of 
transcription of a gene coding for a vertebrate 
homologue of an UNC-53 protein of C. elegans or a
functional fragment of said gene, for use in promoting
neuronal regeneration, revascularisation or wound
healing, or for treatment of chronic neurodegenerative
diseases or acute traumatic injuries or
fibrotic disease. 

61. Use of a compound which is identifiable by 
the method of claims 58 or 59, as an enhancer of 
transcription of a gene coding for a vertebrate 
homologue of an UNC-53 protein of C. elegans or a
functional fragment of said gene, in the manufacture 
of a medicament for promoting neuronal regeneration, 
revascularisation or wound healing, or for treatment 
of chronic neuro-degenerative diseases or acute 
traumatic injuries or fibrotic disease. 

62. A pharmaceutical composition which comprises
the compound of claim 60 and a pharmaceutically
acceptable carrier, diluent or excipient therefor.

63. A compound which is identifiable by the
method of claims 58 or 59 as an inhibitor of
transcription of a gene coding for a vertebrate
homologue of an UNC-53 protein of C. elegans or a
functional fragment of said gene for use in 
alleviating the spread of disease inducing cells or 
metastasis or loss of contact inhibition. 

64. Use of a compound which is identifiable by 
the method of claims 58 or 59 as an inhibitor of 
transcription of a gene coding for a vertebrate 
homologue of an UNC-53 protein of C. elegans or a 
functional fragment of said gene, in the manufacture 
of a medicament for alleviating spread of disease 
inducing cells or metastasis or loss of contact 
inhibition. 

65. A pharmaceutical composition which comprises 
the compound of claim 63 and a pharmaceutically
acceptable carrier, diluent or excipient therefor. 

66. A kit for determining whether a compound is 
an enhancer or an inhibitor of the regulation of cell 
motility, growth or shape or the direction of cell 
migration which kit comprises at least one transgenic 
cell as claimed in any one of claims 22 to 25 to be 
contacted with said compound and at least one cell 
according to claims 21 to 28 to be used as a control 
and means for contacting said compound with one of 
said at least one transgenic cells. 

67. A kit for determining whether a compound is 
an inhibitor or an enhancer of transcription of a 
gene coding for a vertebrate homologue of an UNC-53 
protein of C. elegans or a functional fragment of said
gene, which kit comprises at least one cell as claimed
in any one of claims 21 to 25 and means for contacting
said compound with said cells. 

WO 98/24810 PCT/EP97/06956

68. A kit for determining whether a compound is
an enhancer or an inhibitor of the activity of a
vertebrate homologue of an UNC-53 protein of
C. elegans or a functional equivalent, derivative,
fragment or bioprecursor of said vertebrate homologue
protein, which kit comprises at least one vertebrate
mutant non-human organism produced according to the
method as claimed in claim 29 or a transgenic organism
as claimed in claims 24 to 28 and a wild type of said
vertebrate mutant organism.

69. A method of identifying vertebrate
homologues of an unc-53 gene of C. elegans or a
functional fragment thereof, which method comprises
hybridizing to a DNA library a suitable
oligonucleotide sequence of between 15 to 50
nucleotides of the nucleic acid sequence encoding
unc-53 or a functional equivalent, derivative or
bioprecursor thereof, under appropriate conditions of
stringency to identify genes having statistically
significant homology with the cDNA according to any of
claims 10 to 14.

70. A method of identifying a protein which is
active in the signal transduction pathway of a cell of
which a vertebrate homologue of an UNC-53 protein of
C. elegans or a functional equivalent, fragment or
bioprecursor of said vertebrate homologue is a
component, which method comprises:

(a) contacting an extract of said cell with an
antibody to the vertebrate homologue of the
UNC-53 protein of C. elegans or a functional
equivalent, fragment, derivative or bioprecursor
of said protein,

(b) identifying the antibody/vertebrate
homologue complex, and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein of C. elegans other than the
antibody.

71. A method of identifying a further protein
which is active in the signal transduction pathway of 
a cell of which a vertebrate homologue of an UNC-53 
protein or a functional equivalent, fragment or 
bioprecursor of said UNC-53 protein is a component, 
which method comprises: 

(a) forming an antibody to the first 
identified protein bound to the vertebrate 
homologue of UNC-53 protein of C. elegans in
claim 70, 

(b) contacting a cell extract with said 
antibody and identifying the 
antibody/protein complex, 

(c) analysing the complex to identify any 
further protein bound to the first protein 
other than the antibody, and 

(d) optionally repeating steps (a) to (c) 
to identify further proteins in said 
pathway.

72. A method of identifying a protein which is
active in the signal transduction pathway of a cell of
which a vertebrate homologue of an UNC-53 protein of
C. elegans or a functional equivalent, fragment or
bioprecursor of said homologue protein is a component,
which method comprises:

(a) contacting an extract of said cell with
the vertebrate homologue of an UNC-53
protein of C. elegans or a functional
equivalent, derivative or bioprecursor of
said vertebrate homologue,

(b) identifying any vertebrate homologue of 
UNC-53 protein/protein complex formed and 

(c) analysing the complex to identify any 
protein bound to the vertebrate homologue of 
UNC-53 protein other than the same 
vertebrate homologue of UNC-53 protein. 

73. A method according to claim 72 which further 
comprises contacting a cell extract with any protein 
identified from step (c) not being the same as the 
vertebrate homologue of UNC-53 protein used and 
repeating steps (b) and (c) so as to identify any 
further protein involved in the signal transduction 
pathway of said cell. 

74. A method of identifying a protein involved 
in the signal transduction pathway of a cell of which 
a vertebrate homologue of an UNC-53 protein of
C. elegans is a component, which method comprises:

(a) providing an appropriate host cell 
having a DNA construct comprising a reporter 
gene under the control of a promoter 
regulated by a transcription factor having a 
DNA binding domain and an activating domain, 

(b) expressing in said host cell a first 
hybrid DNA sequence encoding a first fusion 
of a fragment or all of a DNA sequence 
according to any of claims 10 to 14 and 
either said DNA binding domain or the 
activating domain of the transcription 
factor, 

(c) expressing in the host cell at least 
one second hybrid DNA sequence encoding a 
putative binding protein to be investigated
together with the DNA binding or activating 
domain of the transcription factor which is 
not incorporated in the first fusion, 
(d) detecting any binding of the protein 
being investigated with a protein according 
to any of claims 1 to 9 by detecting for the 
production of any reporter gene product in 
said host cell. 



75. A protein identified by the method of any
one of claims 70 to 74 for use as a medicament to
promote neuronal regeneration, revascularisation or
wound healing, or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries or
fibrotic disease.

76. Use of a protein identified by the methods 
of any one of claims 70 to 74 in the manufacture of a 
medicament for promoting neuronal regeneration, 
revascularisation or wound healing, or for treatment 
of chronic neurodegenerative diseases or acute 
traumatic injuries or fibrotic disease. 

77. A pharmaceutical composition comprising a 
protein identified by the methods of any one of claims 
70 to 74 and a pharmaceutically acceptable carrier,
diluent, or excipient therefor. 

78. A process for producing a vertebrate 
homologue of an UNC-53 protein of C. elegans or a
functional equivalent, fragment, derivative or
bioprecursor of said vertebrate homologue which 
process comprises culturing the cells of any of claims 
21 to 28 and recovering said vertebrate homologue of 

79. A process for producing a vertebrate
homologue of an UNC-53 protein of C. elegans or a
functional equivalent, fragment, derivative or
bioprecursor of said protein which process comprises
culturing an insect cell transfected with a
recombinant Baculovirus vector, said vector comprising
a DNA insert encoding said vertebrate homologue of
UNC-53 protein or a functional equivalent, fragment or
bioprecursor of said vertebrate homologue, downstream
of the Baculovirus polyhedrin promoter, and recovering
the expressed vertebrate homologue of UNC-53 protein.

80. A nucleotide sequence comprising the
sequence as shown in figure 15.



81. A nucleotide sequence comprising the 
sequence as shown in figure 16. 


82. A nucleotide sequence comprising the 
sequence as shown in figure 17. 



83. A method of detecting whether a compound is
an inhibitor or an enhancer of expression of a
vertebrate homologue of an UNC-53 of C. elegans, or a
functional equivalent, derivative or fragment of said
vertebrate homologue which method comprises contacting
a cell expressing said homologue with said compound
and monitoring for a phenotypic change compared to a
control cell which has not been contacted with said
compound.



84. A method according to claim 83 wherein said 
cell comprises a cell according to any of claims 21 to

85. A method according to claim 83 wherein said
cell has undergone loss of contact inhibition. 

86. A method according to any of claims 83 to 85 
which is capable of determining whether said compound 
is an inhibitor of expression of said vertebrate 
homologue in which the compound to be tested comprises 
a nucleic acid. 

87. A method according to claim 86 wherein said 
nucleic acid sequence comprises an antisense DNA or 
RNA sequence. 

88. A method according to claim 87 wherein said 
mRNA sequence comprises 3' untranslated regions of
mRNA encoding for said vertebrate homologue. 

89. A method according to any of claims 83 to 85 
wherein said compound to be tested comprises a protein 
having an amino acid sequence potentially suitable for 
inhibiting function of said vertebrate homologue. 

90. A method according to claim 89 wherein said 
protein comprises a protein identified according to 
any of the methods of claims 70 to 74. 

91. A pharmaceutical composition comprising a 
compound identified according to any of claims 83 to 
89 together with a pharmaceutically acceptable
carrier, diluent or excipient therefor. 






92. A nucleic acid sequence identified according 
to the method of any of claims 86 to 88 for use in 
treatment of loss of contact inhibition or carcinoma
which is mediated by a vertebrate homologue of an
UNC-53 protein of C. elegans or a functional
equivalent, fragment, derivative or bioprecursor
thereof.

93. Use of a nucleotide sequence identified
according to the method of any one of claims 86 to 88
in the preparation of a medicament for the treatment
of loss of contact inhibition or carcinoma which is
mediated by a vertebrate homologue of an UNC-53
protein of C. elegans or a functional equivalent,
fragment, derivative or bioprecursor of said
vertebrate homologue.


94. A nucleic acid according to claim 92 for use 
in the preparation of a medicament for inhibiting 
expression of a gene coding for a vertebrate homologue 
of an UNC-53 protein of C. elegans.


95. An NIH3T3 cell line transfected with pCB201
and deposited under LMBP Accession No. 1603CB. 

96. A plasmid pCB 201 of Sequence ID No. 10 
deposited under LMBP Accession No. LMBP 3594.

97. A MCF-7 cell line transfected with plasmid 
pCB 201 deposited under LMBP Accession No. LMBP 
1601CB. 


98. An assay for detecting expression of a 
vertebrate homologue of UNC-53 protein of C. elegans 
in a vertebrate cell which assay comprises contacting 
a cell or an extract thereof with an antibody to said
vertebrate homologue, or a functional equivalent,
derivative or bioprecursor thereof, which antibody is
linked to a reporter molecule, removing any unbound 
antibody and monitoring for the presence of said 
reporter molecule. 

99. An assay according to claim 98 wherein said 
reporter molecule is an antibody conjugated with a 
suitable fluorophore or detectable enzyme. 

100. A method for detecting for expression of a 
gene coding for a vertebrate homologue of an UNC-53 
protein of C. elegans or a functional equivalent,
derivative, fragment or bioprecursor thereof, which 
method comprises contacting a probe specific for a 
nucleic acid or protein sequence coding for or 
corresponding to said vertebrate homologue or a 
functional equivalent, fragment or bioprecursor 
therefor with a cell extract which probe is linked to 
a reporter and analysing for the presence of said 
reporter. 

101. A method according to claim 100 wherein 
said probe comprises a complementary sequence to a
region of mRNA transcribed from said gene encoding 
said vertebrate homologue of UNC-53 protein or a 
functional equivalent, derivative or bioprecursor 
therefor. 

102. A method according to claim 101 wherein 
said complementary sequence is a 3' or 5' untranslated
region of said mRNA. 






103. A method according to claims 100 or 102 
wherein said reporter comprises a radiolabel. 

104. A method according to claim 100 wherein
said probe comprises an antibody specific for said 
vertebrate homologue of said UNC-53 protein or a 
functional equivalent, derivative, fragment or 
bioprecursor therefor. 

105. A method according to claim 104 wherein 
said reporter comprises an antibody conjugated with a 
detectable fluorophore or enzyme. 

106. Phage Lambda clone 3b of Sequence ID No. 5 
deposited under Accession No. LMBP 3595. 

107. A method of determining whether a compound
is an inhibitor or an enhancer of association of
UNC-53 or a vertebrate homologue thereof according to any
of claims 1 to 9 to microtubules or plus end
regions thereof, which method comprises:

(a) contacting said compound with a 
transgenic cell, tissue or organism 
expressing UNC-53 protein or said vertebrate 
homologue and which protein is operably 
linked to a reporter molecule, and

(b) screening for the localisation of said 
reporter molecule as compared to a cell 
according to step (a) which has not been 
contacted with said compound. 

108. A compound identifiable by the method 
according to claim 107. 

109. A compound identifiable by the method 
according to claim 107 as an inhibitor of localisation 
or association of UNC-53 or said vertebrate homologue 
with microtubules or the plus end region thereof for 
use in alleviating the spread of disease inducing
cells or metastasis or loss of contact inhibition.

110. A compound identifiable by the method 
according to claim 107 as an enhancer of association 
of UNC-53 or said vertebrate homologue with 
microtubules or the plus end region thereof, for use 
in promoting neuronal regeneration, revascularisation 
or wound healing, or for treating chronic 
neurodegenerative diseases or acute traumatic injuries 
or fibrotic disease. 

111. A pharmaceutical composition comprising the 
compound according to claims 108 or 109 and a 
pharmaceutically acceptable carrier, diluent or
excipient therefor. 

112. A kit for determining whether a compound is 
an inhibitor or an enhancer of association of UNC-53 
or a vertebrate homologue thereof according to any of 
claims 1 to 9 with microtubules or the plus end 
regions thereof, which kit comprises at least one 
transgenic cell expressing UNC-53 and a reporter 
molecule or a cell according to any of claims 20 to 24 
and at least one cell of the same cell type for use as 
a control and means for contacting said compound with 
one of said at least one transgenic cells. 

113. A composition comprising UNC-53 of
C. elegans or a vertebrate homologue thereof according to
any of claims 1 to 9 linked to a compound identified 
as an inhibitor or enhancer of association of UNC-53 
or said vertebrate homologue with microtubules or 
their plus end regions for use in targeting said 
compound to said microtubule or the plus end regions 
thereof.

114. A composition according to claim 113 which 
further comprises a cell transformation or 
transfecting agent. 

115. A method of targeting a protein to a cell 
microtubule or the plus end region thereof, which 
method comprises introducing into a host cell, tissue 
or organism a transgene comprising a sequence capable 
of expressing UNC-53 or a vertebrate homologue thereof 
according to any of claims 1 to 9, which sequence is 
operably linked to a sequence encoding said protein to 
be targeted such that a chimeric protein is expressed 
and which results in targeting said protein to said 
microtubule or a plus end region thereof. 

116. A method of identifying a molecule which 
covalently modifies UNC-53 or a vertebrate homologue 
thereof according to any of claims 1 to 9, which
method comprises:

a) contacting either an extract from a cell 
expressing UNC-53 or said vertebrate homologue or a 
mixture of enzymes comprising candidate UNC-53
modifying enzymes in the presence of an indicator of 
covalent modification of a protein, 

b) identifying any covalently modified UNC-53 
protein from step a), and

c) identifying said molecule involved in said 
modification step.






117. A method according to claim 116, wherein
said indicator comprises 32P.

118. A method of identifying a compound which
alleviates or enhances the toxicity of UNC-53 or a
vertebrate homologue thereof according to any of
claims 1 to 9, which method comprises contacting said
compound with a cell, tissue or organism according to
claim 27, and monitoring for the presence of said 
reporter molecule adjacent said microtubules or the 
plus end regions thereof. 

119. Plasmid pLM1 of Sequence ID No. 6 deposited
under Accession No. LMBP 3762.

120. Plasmid pLM4 of Sequence ID No. 7 deposited 
under Accession No. LMBP 3763.


121. Plasmid pEGF72 of Sequence ID No. 8 
deposited under Accession No. LMBP 3764. 

122. Plasmid pCB501 of Sequence ID No. 9 
deposited under LMBP Accession No. LMBP 3765.

123. A worm strain comprising a chimeric 
C. elegans human unc-53 gene deposited under LMBP
Accession No. LMBP-1663CB. 

124. A vertebrate homologue according to any of
claims 1 to 3 which is a mouse homologue. 

125. A homologue according to claim 124 having
the sequence illustrated in Figure 14. 

[Figure: double-stranded nucleotide sequence of a plasmid construct, shown with its translated amino acid sequence. Feature labels legible in the scan: "eGFP C.e.unc53sma" and "C.e.unc53 sma"; restriction site label "Ava III".]

CAG ^^^ GA f A AG ^^^^^^^^^^CC G GGTTGGACTCAA G A CGATAGTTACC GG ATAAG G C G CA G C G GTCGGGCT f;AArnf;nrif:f:Trrf:T/T>- .« 

G T ACCGCrATTCAGCACAGAAT G GCCCAACCT G AGrTC T GCrA T CAATG G CC T ATTCC G C G TCGCC AGCCC G ACTT G CCCCCCAA G CACGrG T GTo;, r ^ 
i i SCL TGLOSR R . LPDKAQRSG . TQGSC TOP 

rGAACCCGT.GCrGGATGTGGCTrGACTCTATGGATGTCGCACTcjTACTCTTTCGCGGTG CGAAGGGCTTCCCTCTTrCCGCCTGrcCATASO, ' 
E L R , L o R E L . ESATU P E G R K A D R Y p" 



J GC G G AACA G GAGA G CGCA CGAG G GA G CT rcCA G GG G G AAACGCCrGGrA T C TT TArAGrCCrGTC GGG rTrCGCCACCrC T G ACT,G, 

TTC « GT "*'^^ 5 ,, : 

^^E LPQGHAV Yl y SPVGFRH|_ LE 
pCGTCGATrrrTGrGATGCTCGTCAGGGGGGCGGAGCCTATG G AAAAACGCCAGCAA^ 

~ ' C S 5 ° ° " 5 L . V K N ^ S N A A F L R F LAFCWPFi 
TAGTTATTAATAGrAATCAATTACGGGG rCATTAGTTCArAGCCCATATATGGACTT CCGCGTTACATAACrTACGGTAAArGG CCCGCCr^CTG'C."

ATCAATAATTATCATTAGTTAATGCCCCAG7AATCAAGTATCGGGTATATACC fCAAGGCGCAATGTArTGAATGCCATTTACCGGGCGGACCGAC TGg"^ 
■ • L L ■ ' V ' 1 Y G V ! S » • " ' V C V P B V | T V G K V P A V l T " 

CCCAACGACCCCCGCCCATrGACGTCAATAATG ACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGT'CA ATGGfiTnnAnrATTTAr/'.-- 

GGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCATTGCGGTTAKCCTGA AAGGrAACTGCAGlTACCCACCTCATAAATGCC- 
A " R " " " ° V " " ° V C 5 H S M A N R Q r P L t S H G G V F T ■■ 

AAACTGCCCAC T TGGCAGTACATCAAGTGTATCA TATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCC CGCCTGGCATTA T GrrrA,T, 
TTTGACGGGTGAACCGrCATGTAGTTCACATAGTATACGGTTCATGCGGGGGATAACrGCAGTTA CTGCCATTTACCGGGCGGACCGTAATACGGGTCAT 

" C " L C 8 T S S V S l_j A P Y ■ R ° • » ; H A R L A L C P y ^ 

CATGACCTTATGGGACrTTCCTACTTGGCAGTACATC T ACGTATTAGTCATCGCTATTACCArGGTGATGCGGTTTTGGC AGTACATCAArflfifirr;Tr:r:A 

gtactggaataccctgaaaggatgaaccgtcatgtagatgcataatcagtagcgataatggtaccact acgccaaaaccgtcatgtagtIacccgcac^ iC ° 



H 0 L H G L S y L A V HLR ISHRryHGO 



lCGTC atg tagt tacccgcacc t 

AVLAVHOVAW 



TAGCGGTTTGACTCACGGGGATTTCCAAGTCTC CACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATC AACGGGACTTTCr AAAATfiTrnTV. 

ATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAGTTACCCTCAAACAAAACCGTGGTTTTAGTTGCCCfGAAAGGTTTTACAGCAT 600 
' *■ ' i L T G ' 3 K S P P " - * 0 V jj V L A P K S T G , S K „ S 

ACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGT GTACGGTGGGAGGTCTATATAAGCAGAGCTGGTrTAGTGAAC CGTCAGATCCGCTAGrGr-r, 

TGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGCACATGCCACCCTCCAGATATA TlcGTCTCGACCAAArCACTTGGCAGTCTAGGCGATCGCGA. 
° - L " ' ° A " ° R • * C . T " ° B L Y K 0 3 V F S E P S 0 P L A L 

CCGGTCGCCACCATGGTGAGCAAGGGCGAGGAGCrG rTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACG GCG,CGTAAACGGCCACAAGrT C Ar..--: 
GGCCAGCGGTGG TACCACTCGTTCCCGCrcCTCGACAA GlGGCCCCACC^GGGTAG GACCAGCTCGACCrGCCGCTGCATrTGCCGGTGTrrA^r,.;- *» 



" eGFPC.e.unc53ecl 

» V A T M V S K G E £ L F T G V V P I L 



VELOGDV.NGHKFS 



TuTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAHTrr ATrTrr 



, ACCACCGGCAA GCTGCCCGTGCCCTGGCCCACCCTCGrG4" 
AlAGGCCGC TCCCGCTCCCGCTACGGTGGATGCCGrTCGA CTGGGACTTCAAGTAGACG TGG TGGCCGTTCG AC.CiKKr zrr.r.r.,\rrr.r.r r~ * 



-eGFPC.e.unc53ecl



- $ . C ' G E G ° * T Y G ' L T l « T ■ C T T G K U P y p y p T L v T 

CACCCTGACCTACGGCGTGCAGrGCTTCAGCCGCTACCC C GACCACATGAAGCAGCACGACTTCTTCAAGrCCGCCArGCCC GAAGGCTACGTCCAGGA, 
G.GG u ACrGGATGCCGCACGTCACGAAGrCGGCGATGG GGCTGGTGTAC:TCGTCGr C CrGAAGAAGTTCAGGCGGTACGGGCTTCCGA; GC A,r.TrrT- *« 



-eGFPC.e.unc53ecl



T > T ' G V ° C F S R Y r ° H M K 0 H D F F K S A r P E G r 



V Q E 



CGCACCATCTTCrTCAAGGACGACGGCAACTACAAGAC CCGCGCCGAGGTGAAGrTCGAGGGCGACACCCrGGTG AA.:.:GCATCGAGCT G AAr.,. r Arr- : 
GCG TGGTAGAAGAAGrTCCTGCTGCCGTTGATGTTCrGGG CGCGGC TCC AC TTCAAGCTC CCGCTGTGGGACC AC TTGGCGTAGCTCnArrTrrrr.T. 



-eGFPC.e.unc53ecl
Y * r R a E v K F £ 



T 1 ' ' ' ° ° ° " * ^ — — - _0_0__T L V „ R , £ L K c 
AC TTCAAGCAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAAC FACAACAGCCACAACGTCTATATC ATGGCCGACAAGCA GAAGAACGGCATC&A



-eGFPC.e.unc53ecl 



dfkeognilghkleyn y n s h n 



VYt MAOKQKNG 



I \ 



GGTGAAC TTC AAGATCCGCCACAAC atcg 



ZGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCC CCGTGCTGCTo 
CCACTTGAAGTTCTAGGCGGTGTTGTAGCTCCTGCCGTCGCACGTCGAGCGGC TGGTGATGGTCGTCTTGTGGGGGTAGCCGCTGCCGGGGCACGACfiAr '** 



L A 0 H Y Q Q N T P I G 0 G P V L L 



CCCGACAACCACTACC 



TGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCAC 



ATGGTCC TGCTGGAGTTCG TGACCGCCGCCGGGA 



GGGCTGTTGGTGATGGACTCGTGGGTCAGGCGGGACTCGTTTCTGGGGTTGCTCTTCGCGCTAGTGTACCAGGACGACCTCAAGCAC TGGCGGCGGCC'-T 



1 3o: 



-eGFPC.e.unc53ecl 



P on h ylst qs als kdpnekrdhmvllefvt 



A A G 



TCACTCTCGGCAT GGACGAGC TGTACAAGTCCGGACTCAGATCTACGTCAAATGTAGAAfTGATACCAATC TACACGGATTGGGCCAATCGGCA CC TTTI.' 
AGTGAGAGCCGTACCTGCTCGACATGTTCAGGCCTGAGTCTAGATGCAGTTTACATCTTAAC TATGGTTAGATGTGCCTAACCCGGTTAGCCC;Tf;r.Aiii>.-: 



i no: 




C T L G M 0 E L 



C.e.unc53 ecl



Y K S G L R S T S N V £ L I P I Y T D y A N R H L 3 
GAAGGGCAGCTTATCAAAG ^CGATTAGGGATATTTCCAATGATTTTCGCGACTATCGACTGGTTTC TCAGCTTATTAA TG TGATCGTTCCGATCAA CC^A 
CrTCCCGTCGAATAGTTTCAGCTAATCCCTATAAAGGTTACTAAAAGCGCTGATAGCTGACCAAAGAGTCGAATAATTACACTAGCAAGGCTAGTT.rT- 



-eGFPC.e.unc53ecl



-C.e.unc53 ecl

K ■ * S - L S K S ! » P j S N 0 FROYRLVSOLIHVIVP 



I M E 



TTCrcGCCTGCATrCACCAAACGTTTGGCA A AA ATCACATCG A ,CCTGGATGGCCTCGAAACGTGTCTCGACTACCTGAAAA ATC TGGGTC rCG AC TW ' 

aagagcggacgtaagtgctttgcaaaccgttttta gtgtagc ttggacctaccggagctttgcacagagctga tggactttttagac ccagagctg acg! 




F S " A F T K * L A , * ' I S N LOGIETCLOYL.CNL 



G L D C 



CGAAACTCACCAAAACCGATATCGACAGCGGAA ACTTGGGTGCAGTTCTCCAGCTGCTCTTCCTGCTCTCCACCTACAAGCAGAAGCTTCGG CAAC km 
GCTTTGAGTGGTTTTGGCTATAGCTGTCG CCTTTGAACCCACGTCAA GAGGTCGACGAGAAGGACG AGAGGTGGATfiTTrnrrTTrftA^r.rrr. ri-r *.-t- 




•>* L T K T 0 I 0 S G N 



L ? fl v L 0 L LFLLSTVKQKLaOL» 
AAAAGATCAGAAGAAArTGGAGCAACTACCCACA TCCATTATGCCACCCGCGGTTTCTAAArTACCCTCGCCACGTGTCGCC ACG TCAGCAACCGCTTf'A
^ rT CrAGTCTrCTTTAACCTCGTTGA TGGGTGTAGGTAATACGGTGGGCGCCAAAGATTTAATGGGA GCGGTGCACAGCGGrGCAGTCGrTGGCG, ' 




K 0 0 K K L E Q L p T S 'MPPAVSKLPSP 



"VATSATA 



GCAACTAACCCAAATTCCAACTTTCCACAAATGTCAA CATCCAGGCTTCAGACTCCACAGTCAAGAATATCGAA AATTGATTCATCAAA G ATT fl r.TAT,-, 
CGTTGATrGGGTTTAAGGTTGAAAGGTGTT TACAGTTGTAGGTCCGAAGTC TG AGGTGTC AG TTCT TA TAGCT TTTAACTAAG TAGTTTr T a arr at ji"^ = ** 




* T NPNSNFPQMS 



T SRLQTPQSRISKipsSKIGl 
A6CCA AAGACGTCTGG ACT TAAACCACCCTC A TCATCAACC ACT TCATC A AATAATACAA AT TC AT TCCGTCCGTrGA 




Kf> KTSGLKPP S 



S S T T S S , " " T N S F R P 3 S R S S G N M r-j 
rGTTGGCTCGACGATATCCACATCTGCGAAGAGCTTAGAATCATCATCAACGTACAGCT 



ITCTA 



acaaccgagctgctataggtgtagacgcttctcgaatcttagtagtagttgcatgtcg7g77aI177 



rTTCGAATCTAAACCGACCTACCTCCCAACTCCAAAAA 



TTAGATTTGGC tggatggagggttgaggttttt 




v GST 1 S T S A 
CCTTCTAGACCACAAACCCAGCTAG7 



I S N L fJ R P T S 0 L Q 



■ ~ :TTCGTGTTGCTACA ACTACAAAAATCGGAAGCTCAAAGCTAGCCG CTCCGAAAGCCGTGAGCACC^^ArTT., 

^AAG^TCTGGTGrTTGGGTCGATCAAGCACAACGArGTTGATGTTTTTAGCCTTCGAGTTTCG, ^ 




? SRPOTQU 



. V R . V A r T T K I G S S 



KLAAPJCAVSTP 



K L 



CTTCTGTGAAGACfATTGGA 



AGCAAAACAAGAGCCCGATAACAGCGGTGGTGGTGGTGGTGGAATGC T GAAATTAAAGTTATTC AGTAfir j&^AAArrr.r t~ 
jAAGACACTTCTCATAACCTCGTTrTGTTCTCGGGC TAT ^GTCGCCACCACCACCACCAC CTTACGACTTTAATTTCAATAAGTCATCGTTTTTGGGTA2 




A S V K T I G A 



KO E P 0 N S G G G G G 



GMLKLKLFSS 



N P > 
fig 32 pEGFPecl (1 > 6700) Site and Sequence



' f ^ A rCCAATAGCCCACAACCTACGAGA AAGGCGGCGGCGGTCCC TC AACAACAAAC TTTGTCGAAAATCGCTGCCCCAG rGAAA AGTGGCCTGA Ao 

AAGGAGTAGCTTATCGGGTGTTGGATGC ^TTTCCGCCGCCGCCA CGGAG T fGTTGTT TGAA AC AGCTTT TAGCG ACGGGG TC AC TTTTrAr rr;.': * r t t- 

-eGFPC.e.unc53ecl




CCGCCGACCAGTAAGCTGGGAAGTGCCACGTCTATGTCGAAGCTTTGTACGCCAAAAG 



TTTCCTACCGTAAAACGGACGCCCCAA TCATATC TCAACAAG 



GGCGGCTGGTCATTC6ACCCTTCACGGTGCAGATACAGC TTCGAAACATGCGGTTTTCAAAGGATGGCATTTTGcTTGCGGGn' 



TTAGTATAGAGTTGTTC 



25C« 



-eGFPC.e.unc53ecl




ACTCGAAACGAT G CTCA a A6.GCAGTGAAGAA G AGT CCGGATACGCTGGATTCAACA6CACGTCGCCMCGTCATC.TC GACG C AAGGTrcr CT .,r.r.r 

tgagctttgctacga gtttctcgtcacttcttctc.^.ccta pcgaccIaagttgtcgIgcagcggttg cagtagtagctgccttcc I^ g^T^ 

eGFPC.e.unc53ecl




OSKRCSKSSEE 



£ S G Y A G . F N 3 T S P T S S S T £ G S L $ M 
GCATTCCACATCTTCCAAGAGTTCAACGTCAGACGAAAAGTCTCCGTCATCAGACGATCTTACTCTTAACGCC TCCATCG TGACAGCTATC AGAf Afirrr; 



CGTAAGGTGTAGAAGGTrCTCAAGTTGC AGTCTGCTTTTCAGAGGCAGTAGTCTGCT AGAATGAGAATTGCGG 

-eGFPC.e.unc53ecl



AGGTAGCACTGTCGATAGTCTGTCGGC 



7£i" 



-C.e.unc53 ecl



H S T S 5 * S S T ^ D E K S P S S D D L T L N 

Ar.CCCGCAACACC GGTTTCTCCAAATArTArCAACAAGCCTGTTGAGGAAAAACCAACACTGGCAGTGAA AGGAG 
rA "GCCCTTGTCGCCAAAGAGGTTTATAATAGTTGTTCGGACAACrcCTTTTTCGT7GTGACCCTC A rl 



A S I V T A 



R 0 P 



TGAAAAGCACA GCGAAAAAAGATC 
TTCCTCACTTTTCGTGTCGCTTTTTTCTACi 




i A A T P v 3 P N ! 



N K ? V E £ « P T . I * V « G V K S T A K K D 
C^CTCCAGCTGTTCCGCCACGTGACACCCAGCCAACAATCGGAGTTflTTAf 



GTGGAGG TCGACAAGGCGG TGCACTG TGGGTCGGTTG TT AGf fTr a Ar 



AGTCCAA TTATGGCACATAAGAAGTTGACAAA rGACCCCGrGAfATC TGm 



AATCAGGTTaaTACCGTGTATTC ttc aac tgtttac tggggcac tatagac t 
eGFPC.e.unc53ecl



2&X 



-C.e.unc53 ecl

P^PAyppRQ T 0 P T I G V 7 S P 



P V I S E 
fig 32 pEGFPecl (1 > 6700) Site and Sequence



AAACCAGAACC ^ AAAA ^^TCCAATCAATGAGC ATCGACACGACGGACGTTCCACCGC TTCC^CCTCTAAAA TCAGTTG TT CCACTTAAAA Tfizr r:/-.-. 
"TTGGTCTTGGACTTTTCGAGGTTAGrTACTCGT AGCTGTGCTGCCTGCAAGGTGGCGAAGGTGGAGA TTTTAGTCAACAAGGTGAATTrTACTGAAG^ " 







IRO PPTYDVLLKQ GKITSPV KSFGYEOSSASED 

CCATTGT G aCTCAT G C G TC GG CTCAG G T G ACTCCGCCG ACA A ^CTTCT GG TAATCATrCGCTC G A G , G AA G GA T G GGAAAGAAT.AG A CArr,, A , r , 
GGTAACACCGA G rACGCAG^CCGAGTCCACTGAGGCG G CT G TTTTTGAA G ACCATTA G T AAGCGACCTCTCTTCCTACCCTTTCTTATTCTGTAGTCTTAG 




CCTGACAACTTCGAAGACAGTTCCTCCTTGTCGTCTGGAATAr C CGArAACAACGAGGGGATCCACCGGATCTAGATAACTGATC ATAATC AGCCATAf"' 
^u^LT^TTGAAGCrrcrGTCAAGGAGGAACAGCAGACCTTATAGGCTATTGTTGCTCCCCTAGGTGGCCTAGA TCTATTGACTAG TATTAG ICGGTA'tii; ~ 



-C.e.unc53 ecl



I H R I 



T 0 H N 0 p Y 



P 0 " F E p s s s l s 3 g i Sonne 

"T T r° rmAcTTGcm ^ 

< AAAC»rcrcCAAAA tGAACGAAATTrTTTGGAGGGTGTGGAGGGGGACTTGGACTTTGTATTTTA CT'TACGTTAACAACAACAATTGAACAAA TAA^.' « 
ICRGFTCFKKP P t P P p E p e t . N E c N c c c L V Y ■" 
fig 32 pEGFPecl (1 > 6700) Site and Sequence



TOTATCTTAACGCGTAAATTGTAAGCGTTAA TATTTTGTrAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCArTTTTTAACCAATAGGCCGA AA rCG3 
ACATAGAATTGCGCATTTAACATTCGCAATTATAAAACAATTTTAAGCGCAATTTAAAAACAATTTAGTCGAGTAAAAAATTGGTTATCCGGCTTTAGCC "° 

P K S 



I L T R K L . A L I F C . NS R . | F v K S A H F L T N R 



CAAAATCCCTTATAAATCAAAAGAATAG ACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAAC GTGGACTCCAACGT- 
CTTTTAGGGAATATTTAGTTTTCrTATCTGGCTCTATCCCAACTCACAACAAGGTCAAACCTTGTTCTCAGGTGATAATTTCTTGCACCTGAGGTTGrA^ ^ 
* K S L ' N °- « N R P R • G ■ * «■ F Q F G T ft y H Y , R T V T P T j 

AAAGGGCGAAAAACC GTCTATCAGGGCGATG6CCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGTCGAGGT GCCGTAAAGCACTAAATCGGA 

tttcccgctttttggcagatagtcccgctaccgggtgatgcacttggtagtgggattagWcaaaaaa ccccagctccacggcatttcgIgatttagc^ 

G E K P S I R A MA H Y V N H HP NO VFVGRGAVKH | G 



ACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAA 



AGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGC7 



tgggatttccctcgggggctaaatctcgaactgcccctttcggccgcttgcaccgctctttccttccctIctttcgctttcctcgcccgcgatcccgcga " 00C 
tlkgapqlelogesrr tvre r < g r k r k e r A l g r 



ggcaagtgtagcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtcaggtggcacttttcgggg, 



AAATGTGCGC 



CCGTTCACATCGCCAGTGCGACGCGCATTGGTGGrGTGGGCGGCGCGAATTACGCGGCG^TGTCCCGC GCAGTCCACCGlGAAAAGCCCCTTTACACGC; 



v ° v • R S R . C A PPHPPRLMRRYR 



ARQVALFGEMC 



GGAACCCCTATTTGTTTATTTTTCTAAATACATT CAAATATGTATCCGCTCATGAGACAArAACCCTGATAAATGCTTCAAT AATATTGAAAAAr.r.AA f ;, 
CCTTGGGGATAAACAAATAAAAAGATTTATGTAAGTTTATACArAGGCGAGTACTCTGTTATTGGGAC TATTTACGAAGTTATTATAACTTTTrcCTTCT ^ 
EPL FVY FSKYtQ I C [ R5-0NNPQKCFNN I £ K 6 B 

GTCCTGAGGCGGAAAGAACCAGCTGrGGAATGTGT GTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGG CAGAAGTATGCAAAGCATGCATCT.- 

caggactccgcctttcttggtcgacaccttacacacagtcaatcccacacctttcaggggtccgaggggIcgtccgtctIcatacgtttcgtacgtag^ 13iC 

- L * K E " A V E C V 5 V * V » ' V P. » L P S R Q K Y A r H A s" 

A AT T AG TC AGC A ACC AGGTGTGG AAAG TCC CC A G GC TCCCCAGCAGGC AG A AG TATGCAAAGCATGCATCTC A ATT AG TCAGCAACC AT AG TCCffirrf-* 
TTAATCAGTCGTTGGTCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACG TTTCGTACGTAGAGTTAATCAGTCGTTGGTATCAGGGCGG'jG 



Q L V S N 0 y VKVPRLPSROKYA 



KHASOLVSNHSPAP 



TAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGr TCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTT TTATTTATGCAGAGGCCGAGGCCGCCT- 

gaggcgggtagggcggggattgaggcgggtcaaggcgggtaagaggcggggtaccgactgattaaaaaaaataaatacgtctccggctccggcggau t5K 

FFYLCRGRGRL 



" S A . " " A P " S AQFR PFSAPULTN 



GGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGC TTTTTTGGAGGCCTAGGCTTTTGCAAAGATCGATCAAGAGACAGG ATGAGGATCGTTTCGCATr.AT 

CCGGAGACTCGATAAGGTCrTCATCACTCCTCCGAAAAAACCTCCGGATCCGAAAACGTlTCTAGCTAGlTCTCTGTCclACTCCTAGCAAAGCGTACTA ^ 
G 1 • A ' " E V V R R «• 1 * » P B L L Q R S , K R Q p E Q R , A 

TGAACAAGATGGATTGCACGCAGGTTCTCCGGC CGCTTGGGTGGAGAGGCrATTCGGCTATGACTGGGC ACAACAGAC AATCG GCTGCTCTGATGCCGC 
ACTTGTTCTACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGlGTTGTCTGTlAGCCGACGAGACTACGGCGi 

lmkmq ctqvlrplgvrg ysamtghnrqsaalmpf 

GrGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTC TTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAAGA CGAGGCAGC G cr.,,rTArr,T 
CA:AAGGCCGACAGTC G CGTCCCCGCGGGCCAAGAAAAACAGrTCTGGC;GGACAGGCC:CGGGACTT ACrTGACGTTC;GCTCCGTCGCGCCGATAG.-l ^ 
C 3 C C ° R ■ » G A R F F «; S R P T C P V P . M M C K r R 0 R G Y R 
fig 32 pEGFPecl (1 > 6700) Site and Sequence



GGCTGGCC0CGACGGGCGTTCCTTSCGCAGCTGTGCTCGACGTTGTCACTGAA6CGGGAACGGAC 



TGGC TGCTATTGGGCG AAG TGCCGG GGCAGGATC T 
CCGACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAACAGTGAC TTCGCCCTTCCCTGACCGACGA TAACCCGCTTCACGGCCCCGTCCTAGA ^ 

gwprraflaqlcstlslkre g r G C Y V A k c R G R I 



CCTGTCArCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCreCATACGCTTGATCCGGCTACCrG CCCATTCGArfAr 

GGACAGTAGAGTGGAACGAGGACGGCTCTTTCATAGGTAGTACCGACrACGTTACGCCGCCGACGTATGCGAACTAGGCCGArGGACGGGTAAGCTGGIS 
S C " L T L L . L P R .« Y P S V L M 0 C G G C I R L I R |_ P A H S T T 

CAAGCGAAACATC6 CATCGAGCGA6CACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCA6GG GCTCGCGCCA6CCG 
GTTCGCTTTGTAGCGTAGCrCGCTCGTGCATGAGCCTACCTTCGGCCAGAACAGCTAGTCCTACTAGACCTGCTTCTCGTAG TCCCCGAGCGCGGTCGGC 
K * " ' A S S E H V «• « W * P V > S I R M IWTKS IRGSRQP 



AACTGTTCGCCAGG CTCAAGGCGAGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATArCATGGTG GAAAATGGCCG 
TTGACAAGCGGTCCGAGTTCCGCTCGTACGGGCTGCCGCTCCTAGAGCAGCACTGGGTACCGCTACGGACGAACGGCTTATAGTACCACCTrTTACCGGC ^ 

IC M A 



" C S P G S ? R A C P T A R | SS.PMAMPACRISWW 



CTTTTCTGGATTCATCGACTGTGGCC6GCTG GGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAA GAGCTTGGCGGCGAA 

gaaaagacctaagtagctgacaccggccgacccacaccgcctggcgatagtcctgtatcgcaaccgatgggcactataacgacttctcgaaccgccgctI 53<X 

A F L D S S T V A G V V V R T A I R T . R y L p y , L L K s A 
TGGGCTGACCGCTTCCTCGTGCTTTACGG TATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCrTGACGAGTTCTTCTG AGCGGGACTCT 

acccgactggcgaaggagcacgaaatgccatagcggcgagggctaagcgtcgcgtagcggaagatagcggaagaactgctcaagaagacIcgccctgaga 

G L T . " 3 $ C F T V S P L . P ' » » A S P S . A F L T S S S E R Q S 

ggggttcgaaatgac cgaccaagcgacgcccaacctgccatcacgagatttcgattccaccgccgccttctatgaaaggttggg cttcggaatcgttttc 

CCCCAAGCTTTACTGGCTGGTTCGCTGCGGGTTGGACGGlAGTGCTCTAAAGCIAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTrAGCAAAAG 
GVRNDRPSOAQ 



P A . 1 T R F R F H RRLU KVGLRNRF 



CGGGACGCCGGCTGGATGATCCTCCAGCGCGG GGATCTCATGCTGGAGTTCTTCGCCCACCCrAGGGGGAGGCTAACTGAAACACGGAAG GAGACAATAC 
GCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCrCAAGAAGCGGGTGGGATCCCCclcCGATTGAciTTGTGCCTTCCTCTGTTATG MS 
" G ■ " " L ° °. P " A * G g " A G V I R P P . G E A „ . w T 

CGGAAGGAACCCGCGCTATGACGGCAATAAAAAG ACAGAATAAAACGCACGGTGTTGGGT CGrTTGTTCATAAACGCGGGr.rrr.r.Trrr,^^.-..-, 



GCCrTCCrTGGGCGCGATACTGCCGTTATTTTTCTGTCTTATTTTGCGTGCCACAACCCAGCAAACAAGlATTTGCGCCCCAAGCCAGGGTCCCGACCG; 



G RNPRY QGNKK TE . NA RCVVVCS . TRGSVP 



G L A 



C 



TCTGTCGATACCCCACCGAGACCCCATTGGGGC CAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGT TCGGGTGAAGGCCCAGGGCTC 
GAGACAGCTATGGGGTGGCrCTGGGGTAACCCCGGTTATGCGGGCGCAAAGAAGGAAAAGGGGTGGGG rGGGGGGTTCAAGCCCACTTCCGGGTCCCGAG *** 
L C ■ Y " T E T " «■ « P ' * P H f F L F P r P P P K F G ■ R P R A 

GCAG CCAACGTCGGGGCGGCAGGCCCTGCCATAGCCTCAGGTTACTCATATATACTTTAGATTGATTTAAAACTTCAT TTTTAATrTAAAAGGATCTAG-, 
CGrCGGTTGCAGCCCCGCCGrcCGGGACGGTATCGGAGrCCAATGAGTATArATGAAATCTAACTAA ATlTTGAAGTAA^ATTAAATTrTCCTAG-Tr ' **' 
PSQRRG G RPCHSURLL IYTL O.FKTSFl I . !C 0 l "(i 

TGAAGATCC TTTTTGATAA TC TCATGACCAAAAT CCC TT AACGTGAGTTT TCG TTCC ACTGAGCGTCAG ACCCCGT AGAAA AG ATCAAAG3 ATC TT"T T': 
AC TIC rASGAAAAACTATTAGAGTACTGGTTTTAGGGAATTGCACTCAAAAGCAAGGTGACTCGCAGTC TGGGGCATCTTTTCTAGTTTCCTAGAAGAA" °' * 
' ° " F ■ ■ $ " ° ° " " «• T ■ V ,F V P L S V B p RRKOOR IFL 
fig 32 pEGFPecl (1 > 6700) Site and Sequence

agatcctttttttc tgcgcgtaatctgctgcttgc aaacaaaaaaaccaccgctaccagcgg tggtttgtttgccggat caagagctaccaactcttttt 

TCTAGGAAAAAAAGACGCGCATTAGACGACGAACGTTTGTTTTTTTGGTGGCGATGGTCGCCACCAAACAAACGGCCTAGTTCTCGATGGTTGAGAAAAA ^ 
P S F , F S A R " L I. I A N K K T T A T S GGLFAGSRATNSF 



CCGAAGGTAACTGGCTTCAGCAGAGCGC AGATACC AAATACTGTCC TTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTA GCACCGCCTh 
GGCTTCCATTGACCGAAGTCGTCTCGCGTCTA^ 

5£ gnvlqqsaqtkycpss vavvrpp uqelcsta 



62C-: 



Y 



catacctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaaga 

gtatggagcgagacgattaggacaatggtcaccgacgacggtcaccgctattcagcacagaa tggcccaacctgagttctgctatcaatggcctattccg 33CC 

t P R S A N P V T S G C C Q V R , y y S y R V g L K T I V T 6 . G 
6CAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACA6CCCAGCTTGGAGCGAAC6ACCTACA 

CGTCGCCAGCCCGACTTGCCCCCCAAGCACGTGTGTCGGGTCGAACCTCGCTTGCTGGArGTO 5UCC 
A A V G L N G G F V H T A QLGANOLHRT- EIPTA.AMRK 

GCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAAC GCC 

CGGTGCGAAGGGCTTCCCTCTTTCCGCCTGTCCATAGGCCATTCGCCGTCCCAGCCTTGTCCTCTCGCGTGCTCCCTCGAAGGT— " — " + 35C< 

RHASRftE KGG QVSGK ROGRNRRAHEGASRG 



TGGT 

TCCCCCTTTGCGGACCA 
K R L 



ATCTTTATAGTCCT GTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGrCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAAC GC 

TAGAAATATCAGGACAGCCCAAAGCGGTGGAGACTGAACTCGCAGCTAAAAACACTACGAGCAGTCCCCCCGCCTCGGATACCTTTTTGCGGTCGTTGCG ^ 
S L •■ S C R V 5 P P L T ■ , A S I F V M L V R G A E P M E K R Q 0 R 

GGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCATGCA T 
CCGGAAAAATGCCAAGGACCGGAAAACGACCGGAAAACGAGTGTACAAGAAAGGACGCAATAGGGGACTAAGACACCTATTGGCATAATGGCGGTACGTA "** 
gLTTVPGLLL AFCSHV LSCV1P.FCG.PYYRHA 
fig 33 pEGFPxba (1 > 5447) Site and Sequence

Enzymes: 72 of 146 enzymes (Filtered)

Settings: Linear, Certain Sites Only, Standard Genetic Code

TAGTrATTAATAGTAArCAATTACGCGGTCATTAG TTCAfAGCCCATATATG GAGTTCCGCGrTACATAACTTAC GGTAAArGGCCCG cZTGnFrfllrr^ 

ATCAArA.TTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGrATlGAATGCCATlTACCGGGCGGACCGACrGG- 
* W ^GV I SS . P | VGVPft Y I T YGK V P a V |_ j ' 

CCC AACGACCCCCGCCCATTGACGTC AATAATGAC GTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGr CAATGGGTfi^Ai'TA TTTArr.-- 

gggttgctgggggcgggtaactgcagttattactgcatacaagggtatcattgcggttatccctgaaaggtaactgcagttacccacctcataaatgcch 

A 0 R P P p *.^ VN NOV CSHSNAN RQFPLTSMGGVF TV 

^ACrGCCCACTTGGCAGTACATCAAGTGTArCATA TGCCAAGrACGCCCCCTATTGACGTCAATGACG G TAAATGGC CCGCCT GG CA T TA T r. f -r, 1 . T - 

TTTGACGGGTGAACCGrCATGTAGTTCACATAGTATACGGTTCATGCGGGGGATAACTGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGG TCA r 
N C P L GSTSSVS YA KYAPY . R Q .R . MARlalCPV 

CATGACCTTATGGGACTTTCCTACTTGGCAGTACA T CTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCA G TACATCAATnGnrnTr.,i.\ 

gtactggaataccctgaaaggatgaaccgtcatgtagatgcataatcagtagcgataatggtaccactacgccaaaaccgtcatgtagt^cccgcacct UC ° 

D L . H 6 L S Y L A V H L R ! S H R Y Y H G 0 A V L A V H 0 V A V 

TAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCA CCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAA AATCAACGGGACTTTCCAAAATfiTrnT^ 
ATCGCCAAACTGAGrGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAGTlACCCTCAAACAAAACCGTGGTTTTA GTTGCCCTGAAAGGTrTTACAGCA; *» 
V • L T 5 ' 5 K S P P " • R ° " ^ V U A P K S T G t S . M S , 

ACAACTCCGCCCCATTGACGCAAATGGGCGGTAGG CGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTG AACCGrCAGAT.r.rrA,^^- 

tgttgaggcggggtaactgcgtttacccgccatccgcacatgccaccctccagatatatt cgtctcgaccaaatcacttggcagtctaggcgatcgcgat 

J ' " " ' ° A N G » ■ * C T " G C L V K 0 S V F s E P S D P u A L 

CCGGTCGCCACCATGGTGAGCAAGGGCGAGGAGCTG TTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGG ACGGCGACGTAAACGGCCACAAGrTrAnr" 
GGCCAGCGGTGG TACCACTC GTTCCCGC ^CTCGACAAGr GGCCCCACCACGGGTAGGA CCAGCTCGACCTGCCGC TGCATTrGCCGGTGrTCAAGTCG T 730 

" V A T " V S K G E E L F T G ^ » ■ U V E L 0 G 0 V N G H < F 3 

TGTCCGGCGAGGGCGAGGGCGArGCCACCTACGGCAAG CTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCC CGTGCCCTGGCCCACCr.rrr.r.-.,- 
ACAGGCCGCKCCGCTCCCGCTACGGrGGATGCCGTTC GAC T GGGACrTCAAGTAGACGlGG TGGCCGTlcGACGGGCACGGGACCGGGlGGGAGCACT; W 

-eGFPC.e.unc53xba

V 3 G ' g E G ° A T Y G ^ T ^ ^ ■ C T T G K L P V p y p r L v T 

CACCCrGACCTACGCCGTGCAGTGCTTCAGCCGCTACCC CGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCA rGCCCGAAGGCTACGTCCAGGA, 
G^oGACTGGATGCCGCACGrCACGAAGrCGGCGATGGGGCT GGTGTACTrCGTCGTGclGA AGAAGTTCAGGCGGTACGGGCTTCCGATGCAGGTCCT:- 



-eGFPC.e.unc53xba 



T L T . ' G " ° C F S R Y P 0 H M K Q H Q F F K S A H P E G T y Q E 

CG.-ACCATCTTCTTCAAGGACGACGGCAACTACAAGAC C CGCGCCGAGGTGAA G TTC G AGGGCGACACCCTGGT G AAC CGCATCGAGCT G AAG GC CAT.:n 
GLuTGoTAGAA G AA G rTCC ^^G^^G A ^G^^^^ GGG C G C GG CTCCACTTCAAGCTC CCGCTGTGG G ACCACTTG G CGrAGCrCGArTTrrf GT^i~.~ ' 



-eGFPC.e.unc53xba



^ r KQDGN YK T ft A £ V K F £ G D T t V N R I E L K G I 
fig 33 pEGFPxba (1 > 5447) Site and Sequence



AC rTCAAGGAGGACGGCAACATCCTGGGGCA CAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATC ATG GCCGAC AAGCAGAAG 
TGAAGTTCCTCC ^^^^^G^GGACCCCGTGTTCGACCTCATGTTGATGTTGTCGGTGTTGCAGATATAG TACCGGC TGtTc^Ttttc 



AACGGCATCA4 



TfGCCGTAGTT 



OfKEOGNlLGH 



-eGFPC.e.unc53xba



KLErNYNSHNV 



Y I M 



* D K Q K N G | 



GGTGAAC TTCA AGATCCGCCACAAC A 
CCACTTGAAG 



TCGAGGACGGCAGCGTGCAGCrCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCrrr.rnrr.,,. 



TrCTAGGCGGTGTTGTAGCTCCTGCCGTCGCACGTCGAGCGGCTGGTGATGGTCGTCTrGTGGGGGTAGCCGCTGCC^ 



' 1 • * 12GC 
GGCACGACGAC 



-eGFPC.e.unc53xba
V N F K ' « .H N I E D G S V QLA0HYQ 0NTPI 



G 0 G P V L L 



CCCGACAACCACTACC 



TGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCAC 



ATGGTCCrGCTGGAGTTCGTGACCGCCGCCGGGA 




PONHYLSTQS 



* L S K 0 P N E K R 0 H M V L t E F V T A 



A G 



rCACTCTCGGCATGGACGAGCTGTACAAGTCCGGACTCAGATCTACGTrAAAT^TAn a ATT/i a 



jTGAGAGCCGTACCTGCTCGAC 



ATCTACACGGATTGGGCCAATCGGCACCTTTC 




ttCC 



TLGMOELY 



k sglrs tsnvel ip Jytq vanrhls 

GAAGGGCAGCTTATCAAAGTCGATrAGGGATATTTCCAArGATTTTCGCGACTATr,^ TGTGATCGTTCC 



CTTCCCGTCGAATAGTTTCAGCTAATCCCTATAAAGGTTACTAAAAnrnrrr, A T A .r 



TGACC AAAGAGTCG AATAATTACAC TAGCAAGGCTAGTTGC TT 



I 5CC 




" ' 5 K S ' » ° ! 5 " » ' » ° V » L V S Q L , ,., y , v p , H E 

TTCTCGCCTGCATTCACGAAACGTTTGGCAAAAATCACATCGAACCTGGATGGCCTCGAAACGTGTrTrfXArTa 



AAGAGCGGACGTAAGTGCTrtGCAAACCGTTTTTAGTGTAf;r TTftnArrTArrrr a*»»- 



r ACC TGAAAAA TC TGGGTC TCG AC TGCT 




sec-: 



S P A F TKRL 



*■ K ' T S N L 0 C ■- E T CLOYLKNLGLOC 



CGAAACTCACCAAAACCGATATCGACAGCGGAAACTTGGGTGCAr.TTr 



TCCAGCTGCTCTTCCTGCTCTCCACCTACAAGCAGAAGCTTCGGCAACTGAA 



GCTTrGAGTGGTTTTGGCTATAGCTGTCGCCTTTGAACCCACGTCAAGAGGTCGACGAG^GGACGAGAr^^T 



TCGTCTTCGAAGCCGTTGACTT 



1 70-: 
Figure 51b
fig 52 pLM5 (1 > 5425) Site and Sequence
Enzymes : All 146 enzymes (No Filter) 

Settings: Linear. Certain Sites Only. Standard Genetic Code 



GACGGATCGGGAGATC TCCCGATCCCCTATGGTCGAC TC TCAGTACAATC TGCTCTGATGCCGCATAGT TAAGC CAGTATCTGCTCCCTGCTTG TG TGT7 

CTGCCTAGCCCTCTAGAGGGCTAGGGGATACCAGCfGAGAGTCATGTTAGACGAGACTACGGCGTATCAATTCGGTCATAGACGAGGGACGAACACACAA ' C ° 
T P R , E | g R 5 P M V Q S Q Y W L L . C RIVKPVSAPCLCV 

GGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTG CTTAGGGTTAGGCGTTTTGCG 
CCTCCAGCGACTCATC ACGCGCTCGTTTTAAATTCGATG TTGTTCCGTTCCGAAC TGGCTGTTAACGTACTTCTTAGACGAATCCCAATCCGCAAAACGC ^ 
G G R j v v R E Q N L S Y N K A R L DROLHEESA.G.AFC 



CTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGG TCATTAGTTCATAGCCCATATA 
GACGAAGCGCTACATGCCCGGTCTATATGCGCAACTGTAACTAATAACTGATCAATAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATAT ^ 
A A S R C T G 0 I Y A L T L I I D L L I VINYGVISS.PIY 

TGGAG ^^^^^^^^^CATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTC AATAATGACGTATGTTCCCATAGT 
ACCTCAAGGCGCAATGTATTGAATGCCATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCA ^ 
G V P . ft Y 1 , T Y G K V P A V L T A Q R P P P I D V N N D V C S H 5 

AACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAA GTGTATCATATGCCAAGTACGCCC 
TTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTGATAAATGCCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTrCATGCGGG ^ 
NAN RDFPL TS MGG LFTVNC P UGS T S S V S Y A K Y A 

CCTATTGACGTCAArGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTAC A 

GGATAACTGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCATGTACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGT 
P Y ■ • R Q ; R . ■ " I A U C P V H D L M G L S Y I A V H L R I S H 

TCGCTATTACCATGGTGATGCGGTrTTGGCAGTACArCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTC TCCACCCCATTGACGTCAA 
AGCGATAATGGTACCACTACGCC AAAACCGTCATGTAGT TACCCGCACCTAfCGCCAAACTGAGTGCCCCT AAAGGTTCAGAGGTGGGGTAACTGC AGTT ?C ° 
R V Y H G D A V L A V H Q V A V I A V |_ TGI S K S P P H . R 0 

TGGGAGT TTGTTTTGGCACC AAAATCAACGGGAC T TTCC AAAATGTCGTAACAAC TCCGCCCCAT TGACGCAAATGGGCGGTACGCGTG TACGGTGGGAo 
ACCC TCAAACAAAACCGTGGTTTTAGT TGCCCTGAAAGG TTTTACAGC ATTGTTGAGGCGGGGTAAC TGCGTTTACCCGCCATCCGCAC AFGCC ACCC TL' ^ 
V E F V L * P K 5 T G L S K M S . Q L R P I D A N G R A C T V G 

GTCTArATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACrGGCTTATCGAAArTAATACGACTCACTATAGGGAGACCCAAGC TGGC TAG- 
CAGATATATTCGTCTCGAGAGACCGATTGATCTC TTGGGTGACGAATGACCGAATAGC TTTAAT TATGC TGAG TGATATCCC TC TGGGTTCGACCGATC3 ^ 

^ I 

GL YKQSS iAN p TH CLLAYRN. YOSL . GOPSVLA 



TACGAC 



GTTTAAACTTAAGCTTACCATGGGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGCATGACTGGTGGACAGCAAA TGGGTCGGGATCTG 
CAAAT TTGAATTCGAATGGTACCCCCCAAGAGTAGTAGTAGTAGTAGTACCATACCGATCGTAC TGACCACCTGTCGTTTACCCAGCCC TAGAC ATG ' 7 j 

I 

- ProBond binding domain



FXL )CLT H GGSH HHHHHGMASMTGGQGMGRD 



L Y 0 
fig 52 pLM5 (1 > 5425) Site and Sequence



GATGACGATAAGGTACCTAGGATCCATGCAAATGAGGAGGAGGAGCCAGAGAAGAAGGAGGTATCGGAGCr GCGCrCTGAGCTATGGGAGAAGGAAATGM 
CTACTGCTATTCCATGGATCCTAGGTACGTTTACTCCTCCTCCTCGGTCTCrTCTTCCTCCArAGCCTCGACGCGAGACTCGATACCCTCTTCCTTTAC- 



3 



- pLM5 insert = U3



0 0 D K V ? R 1 H A N E EEEPEKKEVSELRSELVEKEM 



AGCTTAC AGACATCCGCTTGGAGGCCCTCAAC TC7GCCC ACCAACTGGATCAGCTTCGGGAGACCATGCACAACATGCAGT TGGAGGTGGACCTGC TGAA 
TCGAATGTCTGTAGGC3AACCTCCGGGAGTTGAGACGGGTGGTTGACCTAGTCGAAGCCCTCTGGTACGTGTTGTACGTCAACCTCCACCTGGACGACT- 



- pLM5 insert = U3



-ORF U3



KLTDIRLEALNS 



AHQLOGLRETMHNMQLEVOLLK 



AGCAGAGAATGACCGACTGAAGGTAGCCCCAGGCCCCTCATCAGGCTCCACTCCAGGGCAGGTCCCTGGATCATC TGCATTATCTTCCCCACGCCGCT CC 
TCGTCTCTTACTGGCTGACrTCCATCGGGGTCCGGGGAGTAGTCCGAGGTGAGGTCCCGTCCAGGGACCTAGTAGACGTAATAGAAGGGGTGCGGCGAGg 



- pLM5 insert = U3



-ORF U3 



A E N D R L K y A p g P SSGSTPGOVPGSSALSSPRR3 



CTAGGCC ^^CACTCACCCATTCCTTCGGCCCCAGTC TTGCAGACACAGACCTGTCACCCATGGATGGCATCAGTACTTG TGGTCCA AAGGAGGAAGTGA 
GATCCGGACCGTGAGTGGGTAAGGAAGCCGGGG7C AGAACGTCTGTGTCTGGACAGtGGGTACCTACCGTAGTCATGAACACCAGGT TTCC TCCTTCAC "* 



-ORFU3 



-GLALTHSFGP 



3 L , A 0TOLSPM0G I STCGPKEEV 



CCCTCCGGGTGGrGGTGAGGATGCCCCCGCAGCACATCArCAAAGGGGACTrGAAGCAGCAGGAATTCTTCCTGGGCTGTAGCAAGGTCAGTGGAAA^;: 



GCSASGCCCACCACCACTCCrACGGGGoCGTCCTGTAGTAGTTTCCCCTGAACTTCG TCG TCCT TAAGAAGGACCCGACA TCG TTCCAG TC ACC TT TTCi 



I L RVVV^MPPqh ! I 



-ORFU3 



KGOLKOQEFFLGCSKVSGV 



«"GAC TGGAAGA TGC TGG A TGAAGC TG TT TTCC AAG "3 TTCAAGGAC TATA TTTCTAAAATGGACCC AGCC TCTACCCTGGGAC TAAGCACTG AG TCCA TL* 
AC tgaccttctacgacc TACTTCGACAAAAGGTTC AC AAGTTCCTGATAT AAAGATTTTACC TGGG TCGGAGATGGGACCC TGATTCGTGAC tc AG gtau 



- pLM5 insert = U3 



" ■ -ORFU3 — 

° - V K . M L 0 £ A v ? Q V F * OYISKMOPASTLGLSTcS 



WO 98/24810 24/270 PCT/EP97/06956 
fig 52 pLM5 (1 > 5425) Site and Sequence



catggctacagcatcagccacgtgaa acgagtgttgsatgcagagccccccgagatgcctccttgccgtcgaggtgtcaataacatatcagtctccctch 

G7ACCGATGTCGTAGTCGGTGCACTTTGCTCACAACCTACGTCTCCGGGGGCTCTACGGAGGAACGGCAGCTCCACAGTTATTGTATAGTCAGAGGGAG- 



- pLM5 insert = U3



-ORF U3



H G V S ( S H V K R V L 0 A E P P E M P PCRRGVNNlSVSl 

AA3GrCTGAAGGAGAAATGCGTCGACA6CCTGGTGTTCGAGACGCTGATCCCCAAGCCGATGATGC AGCACTACATAAGCCTCCTGCTGAAGCACCGGCc 
TTCCAGACTTCCTCTTTACGCAGCTGTCGGACCACAAGCTCTGCGACTAGGGGTTC GGCTAC TACGTCGTGATGTATTCGGAGGACGAC TTCGTGGCCGC 

- pLM5 insert = U3



-ORF U3 



*G l*EKC VPS lVF£T L I PKP MMQHY ISLLLKHRR 

CCTCGTCCTCTCGGGCCCCAGCGGCACGGGCAAGACCTACCTGACCAATCGCTTGGCCGA GTACCTGGTGGAGCGCTCTGGCCGTGAGGTCACAGAGGGC 
GGAGCAGGAGAGCCCGGGGTCGCCGTGCCCGTTCTGGATGGACTGGTTAGCGAACC GGCTCATGGACCACCTCGCGAGACCGGCACTCCAGTGTCTCCCG 

- pLM5 insert = U3



-ORFU3 



L V L S G P S G T G K T Y L TNRLAE YLVERSGREV T E 

ATCGTCAGCACCTTCAACATGCACCAGCAGTCTTGCAAGGATCTGCAACTGTATCTTTCCA ACCTAGCCAACCAGATAGACCGGGAAACAGGAATTGGGJ 
TAGCAGTCGTGGAAGTTGTACGTGGTCGTCAGAACGTTCCTAGACGTTGAC ATAGAAAGG TTGGATCGG TTGGTCTATCTGGCCC TTTG TCC TTAACCC> 

- pLM5 insert = U3



-ORF U3 



1 VSTFNHHQQ^SCK^DLQ LYLSNLANQIDRETGIG 

ATQTGCCCC TGGTGAT TC TAT TGGATGACCTG AG TG A AGCAGGC TCCATC AGTGAGTTGGTC AA TGGGGCCCTCACCTGCAAGTATC ATAAATGTCCC Tm 
TaC ACGG3GACC ACTAAGA fAACCTAC TGGAC TCACTTCGTCCGAGGTAG TCACTCAACCAG TTACCCCGGGAGTGGACGTTCAT AGTATTTAC AGGGA " 



- pLM5 insert = U3 



-ORFU3 



0 tf P L I 1 L L D 0 L S E AGS I S E I V N G A LTCKYHKCPV 

tat tataggtaccacc aatcagcctgtaaaaa tgacacccaaccatggcttgcacttgagc ttcaggatgttgaccttctccaac A ACGTGGAGCC agc> 

ATAATATCCATGGTGGTTAGTCGGACATTTTTACTGTGGGTTGGTACCGAACGTGAACTCGAAGTCCTACAACTGGAAGAGGTTGTTGCACCTCGG TCGG 



I IGTTNQPVKNT 



ORFU3 

PNHGLHLSFRML T F S N N V £ P .& 



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Page «, 



AATGGCTTCCTGGrTCGTTACCTGAGGAGGAAGCTGGTAGAGTCAGACAGCGACATCAATGC CAAC A AGG A AG AGC TGCTTCGGG TGC TCG AC TGGG TA» 
rTACCGAAGGACCAAGCAATGGACrCCTCCTTCGACCATCTCAGTCTGTCGCTGTAGTTACGGrTGTTCCTrCTCGACGAAGCCCACGAGCTGACCCATc 




N G F > V R Y I * * . * t- V E S D S 0 i N A N K g E L L R V L 0 V V 

C CA AGCTGTGGT ATC A TC TCCAC ACC T TCCTTGAGAAGC AC AGC AC CTC AG AC TTCCTCATCGGCCCTTGCTTCTT TCTG TCG TG TCCC ATTGGC A TTG A 
GGTTCGACACCATAGTAGAGGTGTGGAAGGAACTCTTCGTGTCGTGGAGTCTGAAGGAGTAGCCGGGAACGAAGAAAGACAGCACAGGGTAACCGTAACT 



- pLM5 insert = U3



-ORFU3 



PKLVVHL HTF LEKHS TS Ori ! GPCFFL SC P I G I E 

GGACTTCCGGACCTGGTTCATTGACCTGTGGAACAACTCTATCATTCCCTATCTACAGGAAGGAGCCAAGGATGGGATAAAGGTCCATG GACAGAAAGCT 
CCTGAAGGCCTGGACCAAGTAACTGGACACCTTGTTGAGATAGTAAGGGATAGATGTCCTTCCTCGGTTCCTACCCTATTTCCAGGTACCTGTCTTTCGA 2 " 




° F R T V r 1 D > V N N 5 1 I P Y L Q E G A K 0 G I K V H G Q k* A 
GCTTGGGAGGACCCAGrGGAATGGGTCCGGGACACACTTCCCTGGCCATCAGCCCAACAAGACCAATCAAAGCTGTACCACCTGCCCCCACCCACCGTGG 




-ORF U3 



A V £ 0 P V E 'J V ^ D T L P V P S A Q Q Q Q S K L Y H L P P P T V 

GCCCrCACAGCArTGCCTCACCTCCCGAGGATAGGACAGTCAAAGACAGCACCCCAAGTTCTCTGGACTCAGArCCTCTGATGGCCATGCrG CTGAAACT 
CGGGAGTGTCGTAACGGAGTGGAGGGCTCCTATCCTGTCAGTTTCTGTCGTGGGGTTCAAGAGACCTGAGTCTAGGAGACTACCGGTACGACGACTTTGA 



- pLM5 insert = U3



-ORF U3 



° P H S 1 A S , P P £ ,0 R T V X D S T P S S L D S D P L M A M L L K L 

TCAAGAAGC TGCCAAC TAC A7TGAGTC TCCAGATCGAGAAACCATCCTGG ACCCCAACCTTC AGGC AACAC TT fAAGGGT TCGGCAA TC AC T GTCACCCC 
AGfTC TTCGACGGrTGATG TAAC TCAGAGGTCTAGCTC T TTGGTAGGACC TGGGGTTGGAAG TCCGTTGTGAAATTCCCAAGCCGTTAGTG ACAGTGGGG 



- pLM5 insert = U3 



-ORF U3 



_J 



o e 



A N Y 1 ESPORETlLOPNLOArL 



G N H C H P
fig 52 pLM5 (1 > 5425) Site and Sequence Page 5

CG3ACAGCAGAACGCTGGC ATCAGCTATCTTACC TCC TCCTCTCCCCTCTCCTCTTTCAGAGCACTGGCTC TCCAGCCCCAG5AGG AGAAC AGGAGGGA2 
GCCrGTCGTCTTGCGACCGTAGTCGATAGAATCGAGGAGGAGAGGGGAGAGGAGAAAGTCTCGTGACCGAGAGGTCGGGGTCCTCCTCTTGTCCTCCCT: " ' 



- pLM5 insert = U3



R T A E ft V H Q L S LLLSPLLFQSTGSPAPG 



G E Q E G 



GAGGAGATGAAAGAGGAGG3ACAGGTTCTTGGTGCTGTACCTTTGAGAACrTCCTAGGAAGGAATGGTGGGGTGGC GTTTGGGAACTTGTGCCCCCTAAA 
CTCCTCTACTTTCTCCTCCCTGTCCAAGAACCACGACATGGAAACTCTTGAAGGATCCTTCCTTACCACCCCACCGCAAACCCTTGAACACGGGGGATT1 



300 



- pLM5 insert = U3



O^RGGTGSVCCrFENFLGRNG 



6VAFGNLCPl.fi 



CACATTTACTGGCCTCCTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCA GCCATCTGTTGTTTGCCCCTCCCCCGT 
GTGTAAATGACCGGAGG^A GATCTCCCGGGCAAATTTGGGCGACTAGTCGGAGCTGACACGGAAGATCAACGGTCGGTAGA^ 3lC * 

- pLM5 insert = U3

T F T G L L R A ft L N PL I SLDCAF .LPA I C C L P L P R 



GCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAG 

cggaaggaactgggaccttccacggtgagggtgacaggaaaggattattttac tcctttaacgtagcgtaacagactcatccacagtaagataagacccc 32C ' : 

A F L D P G R C H S H C P F L I K . G N C 1 A L S E . V S F Y S G 

ggtggggtggggcaggacagcaagggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatggct tctgaggcgg 

CCACCCC ACCCCGTCCTGTCGTTCCCCCTCCTAACCCTTCTGTTATCGTCCGTACGACCCCTACGCCACCCGAGATACCGAAGAC TCCGCC TTTCTTG'jT" ^ 
GV GGAGQ QGG GLGRQ. q acv gcggly g f . g g k n q 

GCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGGCGCATTAAGCGC6GCGGGTGTGGTGGTTACGCGCAGCGTGACCGC TACACTTGCCAGCGC 
CGACCCCGAGATCCCCCATAGGGGrGCGCGGGACATCoCCGCGTAATrCGCGCCGCCCACACCACCAATGCGCGTCGCACTGjCGATGTGAACGGTCGClj ^ 
1 G L • G V 3 P R A L . R R I K RGGCGGY A QRORYTCQR 

CCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGC TCTAAATCGGGCC ATCCCTTTAGGGTTCCGA 
GGA "-GCGGGCGAGGAAAGCGAAAGAAGGGAAGGAAAGAGCGGTGCAAGCGGCCGAAAGGGGCAGTTCGAGAT TTAGCCCCG TAGGGAAATCCC AAGGC ^ ^ 
' S A R S F R F <- p , P I S R H V R R L S P S S S K S G H P F R V F 

T T7AG TGCTTTACGGC ACC TCGACCCCAAAAAAC T TGAT TAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACCGTTT TT CGCCC TTTGAC0 T 

AAA TC ACGAAATGCCGTGGAGCTGGGGTTTTTTG AAC TAATCCC AC TACCAAGTGCATCACCCGGTAGCGGGACTATC TGCC AAAAAGCG5GAAAC TGCA ^ 

1 • C F r * P. R p q * T L G V F T . V A I A L I 0 G F 3 ? F D V 

TGGAGTCCACGTTCTrTAArAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGG GGAT 
ACC AGGTGCAAGAAATTATCACC TGAGAACAAGGT 7TGACCTTG TTGTGAGTTGGGATAG AGCC AG ATA AG AAA AC TAAATATTCCC TAAAACCCC TA ^ 
C V H V L • V T L V P N V N N T Q P Y L G L F F . F I R 3 F G 0 

T TCGGCC TAT TGGTTAAAAAATGAGC TGATTTAAC AAAAATT TAAC GCGAATTAATTCTG TGGAATGTGTGTC AGTTAGGGT5 7GGAAAGrcCCCAGGC" 
AAGCCGGATAACCAAFTTT TTAC TCGAC TAAATTG TT 7T TAAATTGCGCT TAATTAAGAC ACC T TAC AC AC AG TCAATCCCAC ACC T T TCAGGGGTCCGA 
F ° L > V K K • AOL T :< | PEL I LWNVC0LGCGK5PC 






Tuesday, 18 November 1997 13:56 Pa t 

fig 52 pLM5 (1 > 5425) Site and Sequence

CCCCAGGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCC CAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCA 
GGGGrCCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAATCAGTCGTTGGTCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTTCGT ^ 
SP GRQ K Y AK H A S Q L.V S N Q V VKVPRLPSROkYAVH 

TGCArCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGrTCCGCCCA TTCTCCGCCCCATGGCTGACT 
ACGTAGAGTTAATCAGTCGTTGGTATCAGGGCGGGGATTGAGGCGGGTAGGGCGGGGATTGAGGCGGGTCAAGGCGGGTAAGAGGCGGGGTACCGACTGA ^ 
A S Q L V S N H S P A P N S A H P A PNSAQFRPFSAPVLT 

AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCTGCCTCTGAGCTATTCCAGAAGTAGT6AG6AGGCTT TTTTGGAGGCCTAGGCTTTTGCAAAAAG 
TTAAAAAAAATAAATACGTCTCCGGC TCCGGCGGAGACGGAGACTCGATAAGGTCTTC ATCACTCC TCCGAAAAAACCTCCGGATCCGAAAACGTTTTT^" "'^ 
NFF Y lCftGRG RLC L . A I PE VVRPLFVRPRLLOK 

CTCCCGGGAGCTTGTATATCCATTTTCGGATCTGATC AAGAGACAGGATGAGGATCGTTTCGCATGATTGAACAAGATGG ATTGC ACGCAGG TTCTCCGG 

GAGGGCCCTCGAACATATAGGTAAAAGCCTAGACTAGTTCTCTGTCCrACTCCTAGCAAAGCGTACTAACTTGrTCTACCTAACGTGCGTCCAAGAGGCC ^ 
* p . C ? L Y I H F R _ | • SRORM RtV SHD. TRW I A R R F S G 

CCGCTrGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGC TGTCAGCGCAGGGGCGCCCGGT 
GGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACTACGGCGGCACAAGGCCGACAGTCGCGTCCCCGCGGGCCA ^ 
R > G , G E A . 1 R L . L G T T 0 N R L L C R RVPAVSAGAPG 

TCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGA CGGGCGTTCCTTGCGCAGCT 
AGAAAAACAGTTCTGGCTGGACAGGCCACGGGAC TTACTTGACGTCCTGCTCCGTCGCGCCGATAGCACCGACCGGTGCTGCCCGCAAGGAACGCGTCGA ^ 
SFC QDRPVRC PE . T A G R G S A A I V A G H 0 G R S L R S 

GTGCTCGACGTTGTCAC TGAAGCGGGAAGGGACTGGC TGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCC TGCC GAGAAAg 
CACGAGCTGCAACAGTGACTTCGCCCTrCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGACAGTAGAGTGGAACGAGGACGGCTCTTTL ^ 
CARRCH ySGK GLAAl GR SAG AGSPV I S P C S C R E 3 

TA7CCATCATGGCTGATGCAATGCGGCGGCTGCATACGC TTGATCCGGCT ACC I^GCCC ATTCGACCACCAAGCGAAACATCGCATCGAGCGAGC ACGTAC 
ATA'jG TAGTACCGACTACG TTACGCCGCCGACGTATGCGAAC TAGGCCGATGGACGGGTAAGCTGG TGGTTCGCT TTGTAGCGTAGCTCGC TCG7GCAT3 ^ 
1 " H G ; C N A A A A Y ^ - S G Y L P 1 R p PSETSHRASTV 

TCSGATGGAAGCCGGTCTTGTCGAfCAGGATGATC TGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAAC TG I" TCGCCAGGCTC AAGGCG CGC ATGCC- 
AGCC TACC TTCGGCCAGAACAGC TAG TCCTACTAGACC TGCTTCTCGTAG TCCCCGAGCGCGG TCGGCTTG AC AAGCGGTCCGAG TTCCGCGCG 7 £CG(J1: ^ 
SDGSR SC R SG . S G R R A S G A R A S R T V R Q A Q G A H A 

GACGGCGAGGATC fCGTCG TGACCC ATGGCGATGCC TGC TTGCCGAATATC ATGGTGGAAAA TGGCCGC TTTTCTGGATTCATCGACTG TGGCCSGCTGG 

CTGCCGCTCCTAGAGCAGCACTGGGTACCGCTACGGACGAACGGCTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGCTGACACCGGCCGACC ^ 
R R R G S R R QP V/ RC L L A E Y H G G K V P I F WtHRLVPACi 

GTGTGGCGGACCGC TATCAGGAC ATAGCGTTGGC TACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGC TGACCGC TTCC TCGTGCTTTACGGTAT 

cacaccgcctggcgatagtcctgtatcgcaaccgatgggcactataacgacttctcgaaccgccgcttacccgactggcgaaggagcacgaaatgc:ata ^ 
^ggpls gh svgyp . yc.ravrr m g . p l p r a l p y 

CGZCGC TCCCG ATTCGC AGCGCAfCGCC TTC TATCGCC T TC T TGACGAGT TC T TC TGAGCGGGAC TCTGGGGT TCGAAATGACCG AC CAA3CGACGCCC A 
GCGGCGAGGGCT AAGCGTCGCGT AGCGGAAGATAGCGGAAGAACTGC TCAA3AAGAC TCGCC CTGAGACCCCAAGC T T TAC TGGC TGGT7CGC r 2C 
U S " F A A H R L L 3 P S . RVLLSGTLCFEHTOo'a'i'p 
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Tuesday, 1 8 November 1 997 1 3;S8 ^ 
fig 52 pLM5 (1 > 5425) Site and Sequence Page

ACC ^Q^ AT ^^Q^Q^^^^CGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGGAC GCCGGCTGGATGATCCrCCAGCGCGS 
TGGACGGTAGTGCTCTAAAGC TAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAGGCCCTGCGGCCGACC TACTAGGAGGTCGCGCC ^ 
N L , P S Q 0 F . 0 S T .A A F Y E Q L G F G t V F R p A G y M I L Q R G 

GGATCTCATGCTGGAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAA ATAAAGCA 

cctag agtacgacc ^caagaagcgggtggggttgaacaaataacgtcgaatattaccaatgtttatttcgtta tcgtagtgtttaaagtgtttatttcgi" 52C ' : 

D L M L E F F A H P N L F 1 a A Y N G Y K . S N S I T N F T N K A 

tttttttcactgcattctagttgtggtttgtccaaactcatcaatgtatcttatcatgtctgtataccgtcga 

aaaaaaagtgacgtaagatcaacaccaaacaggtttgagIagttacatagaatagtacagacatatggcagctggagatcgatctcgaaccgcattagta ^ 

F F S L H S 5 C G L . S K L J N V S Y H V C IP S T S S . $ L A S 
GGTCATAGCTGTTTCCTCTGTGAAATTGTTAra 

CCAGTATCGACAAAGGA^ WOC 
V S . ' L F P V . ; N C Y P L T I P H N I R A G S | K C K A y G A V 

GAGCTAACTCACATTAATTGCGTTG 

' ' 1 i sa 25 

CTCGATTGAGTGTAATTAACGCAAC 



S . I T L I A I
fig 53 pLM6 (1 > 4947) Site and Sequence Page
Enzymes : All 146 enzymes (No Filter) 

Settings: Linear. Certain Sites Only. Standard Genetic Code

GTGGCACrTTTCGGGGAAArGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGA GACAATAACCCTGATAA ft - : 

CACCGTGAAAAGCCCCTTTACACGCGCCTTGGGGATAAACAAATAAAAAGArTTATGTAAGrTTATACArAGGCGAGTACTCTGTTATrGGGACTAr-TA 



V H F S G K C A R N P Y L F I FLNTFKYVSAHETIT 



L I fl 



GCTTC AATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCC 



t --Ttcctgtttttgctca: 



CGAAGTTATTATAACTTTTTCCTTCTCATACTCATAAGTTGTAAAGGCACAGCGGGAATAAGGGAAAAAACGCCGTAAAACGGAAGGACAAAAACGAGT-- ^ 
A S 1 ' L * K I E Y, I V S T F P C R P V S L F C G I LPSCFCS^ 

CCAGAAACGCTGGTGA AAG |AAAAGATGCTGAAGATC AGTTGGGTGCACGAGTGGGTTAC ATCGAACTGGATC TC AAC AGCGGTAAGATCC T TGAGAG T7 

ggtctttgcgaccactttcattttctacgacttctagtcaacccacgtgc tcacccaatg tagcttgacctagagttgtcgccattc taggaactctc aa 330 

PRNAGES KRC . RSVGC T SGL HRTGSQ Q R . 0 P . £ F 
TTCGCCCCGAAG AACGTTTrcCAArGArGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAAC TCGGTCS 

aagcggggcttcttgcaaaaggttactactcgtgaaaatttcaagacgaIacaccgcgccataatagggcataactgcggcccgttctcgttgagccag" 

SPR R TF SNOEHF . SSA HVRG I I P Y . R R A R A T R S 

ccgcatacactat tctcagaatgac ftggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacagtaagagaattatgcagtgc tgcc 
ggcgtatgtgataagagtcttactgaaccaactcatgagtggtcagtgtcttttcgtagaatgcctaccgtactgtcatIctcttaatacgtcacgacgg 500 

" " T > F S E • L G . • V »■ T S H R IC A S Y G U H 0 S K R I M Q C C 

m>accatgagtgataacactgcggccaacttacttctgacaacgatcggaggaccgaaggagctaaccgcttttttgcacaacatggggr 



atcatgtaa 
accccctagtacat? 
anrffaqhggscm 



TATTGGIACTCACTATTGTGACGCCGGTTGAATGAAGACTGTTGCTAGCCTCCTGGCTTCCTCGATTGGCGAAAAAACGTGTTGTA''-'--' S2 ° 

hnhe -"cgoltson o r r r e g 

CTCGCCTTGATCG TrGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACA ACGTTGCGCAAACr 
GAGCGGAACTAGCAACCCTTGGCCTCGACTTACTTCGGTATGGTTTGC TGCTCGCACTGTGG TGCTACGGACATCGTTACCGTTGTTGCAACGCGTTTGA 700 
S P - S 1 ° T G A E ' S H TKRRA . HHOA C SNGNN V A 0 T 

ATTAACTGGCGAAC TACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTrCTGCGCTCGG CCC T TCC2 
^ AATTGACCGCTTGATGAATGAGATCGAAGGGCCGTTGT ^AATTATCTGACCTACCTCCGCCTATTTCAACGTCCTGGTGAAGACGCGaGCCGGGAAGG" ^ 
NVR T TY3SFPAT I NRLDGG G . SCRTTSALGP > 



GCTGGCTGGTTTATTGC TGATAAATCTGGAG CCGG TGAGCGTGCGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCC TC CCGTATCGTA" 

CGACCGACCAAArAACGACrATTTAGACCTCGGCCACrCGCACCCAGAGCGCCATAGTAACGTCGTGACCCCGGTCTACCATTCGGGAGGGCArAG-ATC 
GVLVY C . . I VSR . AVVS RYH CSTGARV . al p y R .5 

TTATCTACACGACG6GGAGTCAGGCAACTA TGGATGAACGAAATAGACAGArCGCTGAGATAGGTGCCTCACTGATTAAGCATTGG TAACTGTCAGAC.-A 
AATAGATGTGCTGCCCC TCAGTCCGTTGATACCTACTTGCTTTATC TGTC TAGCGACTCTATCCACGGAGTGACTaaTTCG TAACCATTGACAG TC TGG7 ^ 

Y >■ H D G E S G N Y G . T K . T Q R . Q r c L T Q . A L V T V R P 
AGT TTACTCATATArACTTTAGArTGATTTAAAACTTCArTTTTAATTTAAAAGGATCTAGGrGAAGATCCTTTTTGAT AATCrCATGACCAAAATCr.-- 

tcaaatgagtatatatgaaatctaac taaattttgaagt aaaaatt aaat TTTcerAGArcc ac ttc t aggaa aaac tat tag ag tactggttttaggga ' '"' 

- L L ' ' T L ° • F ' T S F ■- ! ; * 0 L G E Q P P , , s H „ Q „ p 
rAACGTGAGTTTrCGrrCCACTGAGCGTCA GACCCCGTAGAAAAGATCAAAGGATCrTCrTGAGArCCTTTTrrTCTGCGCGr AArrTGrTGCTTG.AA, 

a "tgc ac tcaaaagcaaggtgac tcgcagtctggggc atct tttctag tt tcc tagaagaac rc taggaaaaaaagacgcgc a r t agacgacgaac gt t " :>: 

* 0 O R I FL R S F F S A R rl L L . A M 



L r ■ V FVPLSVRPRR 
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Tuesday. 18 November 1997 13:56 , 
fig 53 pLM6 (1 > 4947) Site and Sequence Page

CAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGC TACCAACTCTTTTTCCGAAGGTAAC TGGCTTCAGCAG AGCGCAGATACCAAA 

gtttttttggtggcga ^^^^^^ccaaacaaacggcctagttctcgatggttgagaaaaaggcttccattgaccgaagtcgtc tcgcgtctatggttt i: ' C< 

« * T TAT SGGLFAG SRA TN SFSEGNVUQ QSA0T> 



TaCTG TCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAAC TC TGTAGCACCGCCrACATACCrCGCTCTGCTAATCC TGTTACCAGTGGC TGC T 
ATGACAGGAAGATCACATCGGCATCAATCCGGTGGTGAAG^ 

Y C P . S S V A V V R , P P L Q £ L C S T A V I P P S A N P V T S G C 
GCCAG T6GCGATAAGTCGTGTCTTACCGGGTT6GACTCAAGACGATAG TTACCGGATAAGGCGCAGCGG TCGGGCTGAACGGGG GGTTCGTGCACACAGC 
CGGTCACCGCTATTCAGCACAGAATGGCCCAACCTGAGTTCT6CTATC AATGGCCTATTCCGCGTCGCC AGCCCGACTTGCCCCCCAAGCACGTGTGTCG 
C Q V R ; V V , S Y R , v G L K T I V T G . G A A V G L N G G F V H T A 

CCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACC jTACAGCCTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAG GCGGACAGGTATCC 
GGTCGA^CCTCGCTTGCTGGATGTGGCTT^ »6CC 
0 L G A N D L H R T E I P T A . A M R K R H A S R R £ K G GOVS 

GGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCT C TGACTT 
CCATTCGCCGTCCCAGCCTTGTCCTCTCGCGTGCTCCCT^ ,7CC 

GKR QGRNR RA HEG ASRGKRL VSL.SCRVSPPLT 
GAGCGrCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGC CrTTTGCTGGCCTTTTG 

ctcgcagctaaaaacactacgagcagtccccccgcctcggataccttttIgcggtcgttgcgccggaaaaatgccaaggaccgga Iaa 

; ^ S ! ^ V M I V R G A E P M £ K RQQ RGLF TVP G L L L A F C 
CTCACATGTTCTT TCCTGCGTTATCCCCTGA7TCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGM 

gagtgtacaagaaaggacgcaataggggactaagacacctattggcataatggcggaaactcactcgacIatggcgagcggcgtcggctIgctgg^ ,Sa 

S H V L S C V I P . F C G . P Y Y R L , V S . Y R S P Q P IS, D R A 

cagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggcacgacag 

GTCGCTCAGTCACTCGCTCCTTCGCCTTCTCGCGGGTTAUcGTTTGGCGGAGAGGGGCGCGCAACCGGCrAAGTAATTACGTCGACCGiGCTGTCCAAA 
° P V - S £ R G S G R , A R N .T Q T A S P R A L A 0 $ L M Q L A R Q V 

CCCGACTGGAAA GCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATG T 

gggctgacctttcgcccgtcactcgcgttgcgttaattacactcaatcgagtgagtaatccg tggggtccgaaatgtgaaatacgaaggccgagcatacI :K ' : 

$ * L E S G °. ; A Q R w . C E L A H S I G T P G F T L Y A S G S Y V 
TGTGTGGAATTG ^ g a gcgg at aac aa TT TC AC AC AGG AA AC AGCTATG AC CATGATTACGCCAAGCGCGC a at taaccctcac TAAAGGGAACAAAAGCT 

acacaccttaacac tcgcc tattgttaaagtgtgtcc tttgtcgatac tggtactaatgcggttcgcgcgttaattgggagtgatttcccttgttttcga 22C * 

V V N C E R . 1 T ! S H R K Q L . P . L R Q A R N . P S L K G T Y. A 

gggtaccgggccc cccctcgaggtcgacggtatcgataagcttgatatcgaattcctgcagcccgggggatccatgcaaatgaggaggaggagccag^^^ 
cccatggcccgggggggagctccagctgccatagctattcgaactatagcttaaggacgkgggccccctaggtacgtttactcctcctcctcggtctcI 



GYRAPPrgrr 



• U3 fragment



Y R • * -YRIPAARGIHANEEEEPE 
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Figure 43 






Thursday, 27 November 1997 16:40
fig 44 pNP9 Map (1 > 10122) Site and Sequence

Enzymes : All 146 enzymes (No Filter)

Settings : Circular. Certain Sites Only. Standard Genetic Code

ATGACCATGATTACGGCAAGCTTGCA rCCCTCCAC OAATTCC ATATCAA6CTTATC C ATACCCTCG A CCTtG&CCATCACAACAAA TTGGAGCAACTACC 100 
CACATCCATTATGCCACCCf>CGGTTTCTAAGTGA<rrTT ^ 

CTCTATGACTAui ivji I GA<HG A rnTTCATTCAC AAAAT ATTAAAAG GAACA TTATTTA CTTTGCTT AT^GCCCTMCnTGATrrACTTTTTCGATC 300 

aactagatcttacaamcttgcaatacaattccattttcawttaccctcgccacgtgtcgcc^ ^ 
AorrccACAAAT(rrcAACATcuGGcrncAGACTcau(rrcM(^TAKQ sea 

CAGMAAGAAGATGATAA>AATGTAGTrTTTrrGCAAAACTTCCACCTTTATTGC^ 608 
AAATTTTCCACTATACAAA f<TrAGAAAAGTATTTTGCACAAAriTTGTCAGTrGACAGC^ 700 
AGTGGAAGGAGCGCAAGTCTATACTGGAAATAATG^TCrGAAACAA^TTTGTCCTATTCTCAAATGT^ 800 
CACTAGTTTCA6AACCTTTCCTTTTTGTATGA 

^CCAAm ACAAATCGC CTCCCOTGCC C G AAAAGTGCCCACCAAAATC AATTTCTC CGCTTCATAATGACTTTT AAATTG&TGTGAGAAAACACAGAAG 1009 
AGGCTAACTAAATTClACAGGCACAGGrrGTCCCTCTTCTCCCTCCTTCTCCCGCCTCCTCCTCCCff UC3 
TTCTCCATTTTGCTT ATAAAC ATTTCTrGTGTGGAAGGAAACTA CACGGGG AGACGGTCAA ITAA 7TCGAATGAG AGCATGGCAA TTAC7 CTTTCGGAAAT 1200 
T(ATGJUTAAAGATA(UGCaUU<aaaGGCT^ 1300 
CATaCTGACMTTAATGTCGGGTTTTATCCCaCmCrrArrCCaCACTCATTCTG^ W00 
AmATrn-GATATTTAATTrrGTCCAATTA&GG^TAAACACGALT^ 1500 
TTCAAAAAA TCAATAAATATTCC CTAAC AAATTCT ATGG<TAA AATTTTATTC 1600 
ACCTTGAATCCCCCA^(TrrATAGGMGCTCCCTGTCACATTTCCCATGCTATGAATCGQACTCAGCA 1700 
n'ATICGGCACGCGTAATAAACTCCA/GCACTTAGAATTTTAATTC^ 1800 
rrrTCCATGCTTTTTGGCCCATTAAAAAACTTTCTCACCrCTTCATCCATCTCAOCGTATCA 1900 
AGAAGGAGATACTGjlGCCACATGGCGTCTGACCCTTr^ 2080 
CTTGTTTTCTTCTTCGTTrrGACTCGCGCCTATTTTr^ 2X00 
AAAAACGAGGCAATTTAAAAATATTAAAA TTAATGAG G7TGTAGATGTA6 A TTTG GAAAaGAAGAAAAAAACAAAACAAATAGGAA CCGCCAGATCAAAA 2200 
TTCTATTTAAAGGTTTTCAAGATC^TTAGGCAAGATTCGGCTGAACAGAA 2300 
ATCa AAGCaGXAaATAGCCTTATTCTAG ATmA<rrrGCaATAAGCTCAAGCCCAACCAGA>ATU 2400 
GCTTGCTTCAGTCTAATCCAG ACTAGATTrCCftJUUGAGTTnGUTTTTAiU rGTTTCCA O fTTCTTGI T ACTTAAAATCTTAATGCCCTCTGATGCGT 2500 
AAAATCGTTATCCCTTTCTCTCACACTTTCAATTACAGATTCATCAAAGATTGGTATCA^ 2600 
ACTICATf^AATAATACAAATTCATTCCGTCCGTCGACCCGT^ 2700 
TCCGATCCTTCCGGCTTCTTTTTAGAAATTATATTiTTTCAG 2£00 
AAAACCUCTAGACCACAAACCCAGCTAGTTCCT^ 2500 
CTTGCTTCTGTGAAfiACTATTGGAGCA>AA£AAGAG<CCGATAA^ 3000 









3190 



^CTOCTCATCGAATAGCCUCAACCTACCAGA^ 

<^KCGccacaGTMGaGGiuA<ra^ 32fle 

OAWaCGAMCCATOTCAAACACCACT^CAAGACTCCGCATAC^ JJ08 

ccatccattccacatcttccaacacttcmcctcawc<^aactctccctcatcacaccatc^ J4W 

gcccatagccccaacacccc7ttctccaaatattatcaacaaccctctt6ag<3aaaaaccaac^ 35W 

(ATcucacacatmrcGCCACCTcucAcc^^ 36e0 

ctcaaamccagaacctgaaaagctccaatcaatgagcatcgacaccacggacgttccaccccttccacctc^ 37a0 

TTCAATCCCACAACCACCAACCTACGATGTTOT^ MOe 

GACTCCATTCTGGOCATGCGTCGGCTCAGCTrGAOCCGCCGACAAAAACTTCTGGTAATCATTCGCTG J9a0 

AftTCCAGCG<^ACAC(TCTGACGCCGacrrTGCaTGTGCGCCA>AATGAGCWC^ 4^3 

OATCCTGACMCrrCGAAGACAGTTCCTCCTTGTCGTCT^ 41W 

ATGGCAACAGTCCCCTCCAAACATAGCWCTArrCCCACTTTGTTCGCCATCCCACGTCTTCnCCT 42 0& 

CA(nCGATTCTCGATCTCGAGCAGAACAGGACAATCT<n"ACAAACTTCTGTCCCAGTGCC6AAC(UCCCAACGT(% AM 

ACAAUTTCGaAAUTCCCCGCGATAaCATCCTATTCTCaa 44** 

CGACCrrcrTCACAAAAACCAAGCTATTCAGGCCAATTTCATTCACTTGATCGfAAATGCCACCT^ 4500 
CTCTCTTGAGCCCCA GAC G GGTG CCG AACTCGA rCTCGAAATATG ATTCTTCAGCATC CTACT C GG CGCvTTTCCC GAGCTGGAAGCTCTACTGGTA7CTA 460e 
TG GAGAGACGTTC CAACTGCA C A GA CTATCC GATGAAAAATCCCC CGCA CA TTCTGCC AAAA GTG AGATGGGATCCC AA CTATCA CTCGCTAGCAC GACA 4700 

OATATGWTaCTCAATGAGAACrrACGAACATGCrATTCGC^tATGCCACGTGACr 48 00 

AGGAGAACT ATGGAGCATTGTTTGATCTTTTTGAGCAAMGCTTAGAAJ^ACTCACrCAACACATTGATCGATCCA^ 430G 

attcagccaggacattcctcatttgagggatattakaatcat^^ seee 

TCTCTG^TCAtnTGaTCCCATCCATCATCGATCTaTCGTCGKGA^GCAGCAAKAGGAGAA SW 0 
AGAGCTGGATCCGCTCCTCA CTCTCC AAGTTCACCAAGAAGAAGAA CAAGAACTACGACCAACCA CA fATGCCA T CAATTTCCGGA T CTCAAG GAACTCT S283 

TGACAACAn'GATGTGATTGAGTTGAAGCAAGAGCTCAAAGAACGCGATA<rrGCACTTTACGAA<n'CCG 53B0 

^TOTCrUGGGAGACACTGAJlCAAGn(^ACCGA(^CAAaAATTAAAGAA^ 54M 

CCCGCGCCTCAATTCCAGTTATCTACGACGATGAGCATGTCTATCATGCAGCCTGTAGCAGTACATCAGCTA^CAATCrr 55W 

CAACTCAATCAAGGTTACTGTAAACGTGGACATCGCTGGAGAAATCAGTTCGATCoTTAACCCGG^ S669 

ACCAGTCAGTCATGCTGGAAAGACArrGATGTTTCTATTCTAGGA 57W 

CTCCTGATrCTATCCTTGGCTATCA^TTGGTGAACTTCGACGCGTCATTGGAGACTCCAC S860 

CTCAACTACAAfCCGAATGrrCATGCAC6(rrGCCGCAaGAGTCGC<rrA6ACAGTaGGrCar<WTATGCTT $960 

GTCAAGTCAATTrrGACAGACAGACCTCTGGTGTTAGCTGGAGrJSACTGGAATTGCAAACAGCAAAtT^ ftMd 









A7GCA 6S00 
XCT 6606 



GAACAAATtAATCCGAAGATAG rATTGTTAATATCAGCATTCCTGAAAACAATAAAGAAGAATTGC^ 6 1B0 
AA GCAAAGAATC A rGCATCGTAATTCTAGAT AATA TCCCAAAGAATCCAATTGCATTTCn'CT ATCCCTTTTTC CAAATGT C CCACTTCAAAACAA CGAA 6? 09 
GGTCCATTTGTACTATGCACA GTCAACCGATATC AAATCCCTG AGOTGi AATTCAC CAC AATTTCAAAA TGTC ACTAATGTCGAATCCTCTCGAAGGAT 6580 
TCATCCTACC^ACCKCGACGACGGGCGCTAGAGGATGAGTATC^ ^ 
AGCTCTrCAG&CCCTCAATAATTTTArTGACiAAAACGAATTCTGTTGATGTGACACr^ 

TCCCGTGAATGGTrTCATTCGATTGTGGAATGA GAACTTCATTCC ATATT7GG AACCTGTTGCTAG AG ATGG MAAAAAACCTTCGCTCGCTGOICTT^ 
TCGACGATCCCACCGACATCGTCTCTAAAAAATGGCCGTGCnTCCAT g7d0 
GTCACCTGCCAACTCATCC CGAC AA CACTTCA AT CCCCTC CA CTC GTTGATCCAATTG C ATGCTA CCAAGC. A TCAGA CCATCGACAACAT7TGAACAGAA 6&03 
GACTOAATCTTCTCTCGCCTCrCCCCCGCTTTCCTTATCTTCGTACCGCT 6980 
CCATCrCGCGCCCGTGCCTCTGACTTCTAAGTCCAATTACTCTTCAACATCCCT ACATGCI'CTTTCTCCCTCrGCTCCCACCCC CTATTT7TGTTA TTAT 7090 
CAAAAAAACTTCTTCTTAATTTCTTTGTTTTnAGCrT CTTTTAAGTCACCTCTAACAATGAAATTGTGT AGATTCAAAAATAGAATTAATTCGTAATAA 71W 
AAA GTCGAAAAAAa TT GTGCTC CCTCCC CCCATTAATAA TAATTCT ATCCCAAAAT CT ACAC AATGTTCTGTGTACACTTCTTA TGTTrrTTTTACTTCT 7Z09 
CATAAATTTTTrrrClAAACATCATAGAAAAAACCGCAOiCAAMTACm 730a 
TCTGGGCCTCTC ATG ACCTC AAA TCATGCTCATCCTG AjU AA (jnTTGGAGTATTTTTGG AATTTT^ GT GAAAGTTTA T C AAA TTAATTTTCC 7400 

TGCTTTTGCTTTTTGCGGCTTTCCCCTA^ 75 M 
AAGATCCGAAGMGCrTTCGGITTGAGGCTCUCTGGAAGGTGACTAG 

CAGAATAWTTCCCAATATACCAAACATAACTGTTTCrTACTAGTCGGCCCrrACGGGCCCTTTCGT 7700 
GACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCCGATGCCGGGAGCAGACAAGCCCGTCAGCGCGCGTCAGCCGGTGT^ 7*00 
GOGCTGGCTTAACTATGCCGCATCAGAGCAGATTGTACTGAGAGTGCAC^ 7900 
ACGCCGCCTTAAGCCCCTCGTGATACGCC7ATTTTTATAG 8009 
CGCGGAACC CCTATTTGTTTA TTTTTC TAAATAC ATTC AAA?ATGTA rCCGCTCATGAGA C AA 7AAC CCTG ATAAAT GCTTCAATAA TA TTG AAAAAC GA 9189 
ACACTATGAGrATTCAACATTTCCGTCTCGCCCTTATT^ B2M 

A(UTGCTGAAGATCA(nTGGGTGCACGAGTGGCTTACATCGAACTGCATCTCAACAGCGGTAAGATCCTTGAGA(^^ 8300 
ATGATGACCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCWTArrGACGCCGGCCAACAGCAACTCG MOO 
ACTTGOTTGAGTACTCACCAGTCACAG A AAAGCA TCTTAC G GATGGCATG ACAGT AAG AGA ATTATGCAGTGCTGCGITAACCATGACTGA T AACACTGC 8S09 
GGCCAACTTACTTCTGACAACCATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACA 8600 
GAGCTGAATG^GCCATACCAAACGACGAGCGTCACACCACGATGCCTirrAGCAATGGCAACAACGTTGCGCAAA 8700 

tagcttccccgcaacaattaatagactcgatgcaggcggataa^gttccaggaccacttctgcgct^ itoo 

ATCTGGAGCCGCTGACCGTGGGTCTClXGtn'ATCATTG agc8 
GCAACTATcXATGAAXGAAATAGACAGATCGCTGAGATAGGTGCCT 9000 







TTGATTTAAAACn"CATTTTTAATTTAAAAGGATCTA^^ 91flfl 

AGCGTCAGACCCCGTAGAAAAGATGUAGGATCTTCTTCAWTCC 1 1 1 1 M I CTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCACCC 9290 

CTGGT I rCl 1 1GCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCACCAGAGCC CA GAT AC C AAAT ACTGTCCTT CTAGTCTAG CC CT 9300 

AGTTftGCCCACCACTTCAAGAACTCTGTAGCACCGCCTA^ 9460 

TACCGCGT7GGACTCAACACCATAC77ACCGCATAAGGCGCAGCGCTCGCG^ 9S00 

accgaactgagatacctacagcgtgagcattgagaaagcgccacgc^ 9660 

GAWCCCCACGAGGCAGCTTCCACaW^CGCaGCTATCTTTATACTCaCTCGW 9780 

CTCAGGGGGttGGAKaATWAMAACGCaGCAA^ 9*80 

TCCCCTGATTCTGT&GATMCCGTATrACCGCCTTTCACTW 9980 

CG(^UOCGCCCAATACGCAAACCGCCTCTCCCC(*^ 1000C 

GCGCAACGCAATTaaTCTGACTTAGCTCACTCATTAGGCACC^ iei0€ 
CAATTTCACACAGGAAACAGCT XdlZZ 
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figure 45 
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Figure 46 
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ftp a& PNP« Mao (1 > 1264 1) Sif and Scowcnco P «8* * 

GA nTTCA GACTTGG G CTA T AA ATTTTTGTCA AA A CTAGGAA TCTTAAAA TATTTGT'ATTTTTCGAA GA A TGCTC CTCAA T CTCAAATTCATA TTTTAT A 6100 

*TTO<ACCCCCTGATATaCAAAAA^ 6200 

^<™CAmAJ*AAT<acmAATCCOA^ ^ 
^CCTCCGCCTCTGAAGaCTCW^^ 

<KAA>GAATAAGACATUUATCCAKGGCTA(>ca^ €5 ^ 
CTCGTCGAGCACAGAACGGCTATCCrGACAArTTCGAA^C^CTTCCTCCTTcrrCCTCTC^ 

CCAmCTCCCGACTAGACATGGCAACACr^ 6?w 

ccaCTCc<rrcCTccACATCA<rrc<un^ 6M0 

aGCCACaCWCCrrCGGACAACAmGCruWTCCCCGGUTACTCATCCTATTCTCCAa 69W 
GCACTCACAGACTACTCGACGACCTTCTTCACAAAAACCAAGC^ ^ 
ACC^GCA^CAATGGCGGCTCTCnGAGCCCGAGACGWTCCCGAACTCGATGTCGAAATATGATTCT^ 7X W 
GAAGCTCTACTGCTATCTATGGAGAGACGTTCCAACTGCACAGACTATCCGATGAAAAATCCCCCCCACATT^ 7ZW 
AfCACTGGCTAGaCWtAQCATATGCATCr^^ 73 W 
GACTCACT AAXCAA<iAAA CAGGAGAACTATGGAGCATTGTTTWTCTTTTT GACCAAAACCTTAGAAAA CTCACCAACA CATT GATCCATCCAACTTGA 74Q9 
AGCCTGAA<ttGGCAATACGATTCAGGCAGGACArTGCTCATTTGAGGGATATT^^ 7 5TO 
TGAGCTTCTTC6TCAAC CA TCT CTCG AA 7CA GTT GCATCCCATCG ATCATCGATGTCATCGTCGTC GAAAAGC AGC AAGCAGGAGAA GATCA C CTTGAGC 768* 
TCCrrTTGGCAAGAACAAGAAGAGCTGGATCCGCrCCTCACTCTCCAAGTTCACCAAGAAGAAGA^ 77» 
CCG WTCTCAAGGAACTCTTGACAACATTGATGTGATTGAGriGAAGCAAGA GCTC AAAGAACGCGATAGTGC A CHTAC GAAG7CCGCCTTGAC AATCT 7UX> 
GGATCCrTGCCCGCGAACTTGATGTTCTGAGCi^GACAGTGAACAAGrrGAAAACCGAGAACAAGCA^ 7HB0 
CCAGCCACTCGTGCn'CTTCCCGCGCCTCAATTCCACTTATCTAC<UCGATGAGCAT£rrCTATtS^ 1399 
CGAMCGATCCTOGGCTGCAACTCAATCAAGCTTACTGTAAACCTGGACATCGCTGGAGAA^ 5100 
AGGATATCTTCCCATGTCAACCAGTCAtrrCATGCTGGAMGACATT 82|J0 
CATCAACTTGGAATCGATGCTCGTCATTCTATCCTTGGCTATCAAATTGGTGMCTTCCAC^ S3C8 
CAACTGAWTTCTTACTTCCTCAACTACAATCCGAAT um 
GCAA4TUTTaCCA>aCGTCAAGTOATm<iACAGA(UWCGTCTG(rr(r 

G CTGCTf ATGTATCT ATTCGAACAMTCAATCCGAAGATAGTATTCTrTAATATCAGCA^^ g$0 9 
GCCTGGAAAAGATCTTGAGAAGCAAAGAATCATGCATCCTAATTCTAGATAATftTCCCAAAG^ #7W 
CCCACTTCAAAACAACGAAGGTCCATTTCTAGTATGCACACT 

TCGAATCGTCTCGAAG GATT C ATCCTAC6TTACCTCCCACGA CGGKGGTAGAGGATGAG7A rCGTCTAACTGTACA(lATGCCATCAGAGCTCT7CAAAA 8980 
TCArTGACTTCTTCCCAATACCTCTTCAGGCCGTCAATAATTTTATTGAGAAAACGA^ c* M 








TCCTaAACTCTOKTCGATCCOT^ ^ 

ncGCTCCCTccAcmcrTcwccATcccACCGAarcaaaAAAAM 92e0 

AAaCCAAtUCaCCTCCCCTCACaCCUACT^^ 

COACAACATTTCAACA(^AGACTCTAATCTTCTCTCGCCTCTCCCCCGCT^ 9400 

C€CCCCTCrCATCACATCCCCATCTCGCCCCCGTGCCTCTGACTTCTAAGTCCAATTAC^ 9W9 

CCCCtatttttgttattatcaaaaaaact7cttcttaat^ 9We 

TAGAATTAArrCCTAATAAAAAGTCGAAAAAAATTGTGCTCCCTCCCCCCATT« 9700 

™<mrrTTTTACTTCT^ mw 

*n™TTTCTTCGCACCTm^ ^ 
TTTATGAAATTAATTTTCCTG^TTTTGCnTTTGW 

rGATGAGCACGATCCAAGAMGATCGGA^GAAGGTTTGGCTTTGAGGCTCAGTGGAAG Mlw 

GGGTTmGCC7TAAATGACA<^TACATTCCCAATATACC^CATAACTGrTTCa 182ec 

catgacggtgaamcctctgacacatccacctcccgga^ 

C0GCrrGTTGGCG«TGTCGGGC^TGGCTTAACTATGCGGCATCAGAGCA(U7T^ 1O40e 

aagxiagaamtaccgcatcaggcwccttaagggcctcctgatacgcctatttttatac l0S e* 

CACTTTTCGGiiCAAAT^GCGCGGAACCCCTATTTGTT^ATTTrrCTA^ lwce 

CAATAATATTG^AAAAGCAAXJAGTAfGAGTATTCAACATTTCCGTGTCGCCCT x07o < 
AAACGCTG<rTG*AAOTA^AGATGaGAAG 

CCCCGAAGAACCTnTTCCAATGAT(UGCACTTTTAAAGT7CTGCTATGT MW 

ATACAaAT7(TCAGAATCAOTGCTTGAGTAaCACCACrCAC^GAAAACCATm Uoe ? 

CCATWCTttTAACACTGCGG^^ Uie? 

ccttgatcgttgggaaccgcagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcc^ U20€ 

actggcgaactacttactciagctt > cccggcaacaattaatagactggatggagg<g U3W 

GC7CGTT7ATT60 , GATaaATCTGGAGCCGGTGACCGTGCGTCTCCCGGTATCATTGCAG<ACTGGGGCCAW 

CTACACGACG<U*GA<jTCAGGCAACTATGGAT(^CGAAATAGACAGATCGCTG>GATAGG7GCCTCAC^ U5 W 

TAOCATATATACrTTAGATTGATTTAAAACnCATTTTTAATTTAAAAG&^ U6W 

GTGAGTTTTCGTTCC ACTGAGCvTCaGAC CCCG7AC AAAaGATCAAAGGAI CTTCrrGAGATCCTTTTTTTCTGCOCG^ U7<8€ 

AAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGCATCAAGAGCT'ACCAACTCTTTT^ u8W 

CTC(TrnrAGTCTAvCCGrACnAGGCCACCACTTCAAG/JCTCTGTAGCACCGCCTAC^ ll9d8 
GT6CCGATAACTCGIC7C7TACCGGCTTGGAO"CAAGACGATA<rrTACCGGATAAG<*CCCACCGGTCGGG 









^«"CG»AC«CCTACACCGAACTOAtUTACCTACAGCG^^^ ^ 

**^«^CG6MCAG<UGAGCCaCGACGGACCTTO^^ 

<™^6TGATCCTC6TCAC«GGGC^^ 

CATCTTCrrrCCT6CCTTATCCCCT<UrrCTCTGGATAACCOT*TTACC6CCTTTCA<rTW(CTCATACCCCT 124W 
"CT«CT«GCGACGAA«GGAAGAGC6CCCAATACG<AAACCGC^^ 



ACraUAAGCGGGCAGT<yiGC6CAACCCAATTAATCTGA<rn'AGCTCACTCATrAGGCACCCCAG6CTTTACA 
TGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCT 12641 



GGTTTCCC6 12SBC 
12608
Figure 36: Association of C. elegans UNC-53 (expressed
from pTB72) with the microtubular cytoskeleton of HepG2
cells. (A) Microtubules stained with YL 1/2 antibody to
tubulin and (B) staining for C. elegans UNC-53.
Figure 37: Microtubule (+)-end binding of C. elegans UNC-53
following transient transfection with pTB72 of HepG2 (a), MCF7
(b) and Cos cells (c). C. elegans UNC-53 was visualised by
immunofluorescence using mAb 16-48.
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Figure 38 
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Figure 39 
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Figure 41 







figure 42
Tuesday, 18 November 1997 11:48
fig 34 pLM4 (1 > 10070) Site and Sequence



Page 1




-ORF pLM1



' ° F . L K K . K » S E » 0 A y I Q G A L N A S E T T P K E L 3 I K R 

CAAAACTCCTCAGA TAGCATCTCAAGCCrCAACAGCATCACTAGCC ATTCCAGCATCGGCAGCAGCAAGGATGCTGATGCGjXAAAA GAAGAAAAAAAAGA 
GTTTTGAGGAGTCTATCGTAGAGTTCGGAGTTGTCGTAGrGATCGGTAAGGTCGTAGCCGTCGTCGTTCCTACGACTACGCTTrTTCTTCTTTTTTTTcl ^ 



0NS 50S [ 5 5 L N 5 1TSHSS I SSSKOADAKKKK K k 
GTTGGGTCTATGA GCTTCGAAGTTCCTTCAACAAAGCGTTCAGTATAAAAAAGGGGCCCAAGTCAGCTTCCTCATACTCGGATAT AGAGGAGATTGCTA" 

caacccagatactcgaagcttcaaggaagttgtttcgcaagtcatattttttccccgggItcagtcgaaggagtatgagcctatatctcctctaacgat^ " C ' : 



-ORF pLM1



3 " V ' E L * S S F " * « F f ' K K G P K S A a S Y S 0 I £ £ I A T 




Ga'GTCACCGAGGGCCCTGCTCACCCAGCCCCCC ACACT AGGCTGTTCCATGCAAATGAGGAGGAGGAGCC AGAGAAGAAGGAGGTA TCG3 AGCTGCG'"" 
: ..ACAGTGGCTCCCGGGACGAGTGGGTCGGGGG GTGTGArCCGACAAGGTACGTTTACTCCTCCTCCTCGGTCrCTTCTKCTCCATAG.::rCGACGC.; 



-insert pLM1



-ORF pLM1



V T - E 6 " * H P A P H T R L F H A N E £ £ E P £ K It E V 3 E L R 
C ^G AGCTATGGG AGAAGGAAATGAAGC t taca gac atccgct tggaggccc tcaactc tgccc ac c aac TGGA TCAGC TTCGGGAGACCATGCACAAC A7 

G AC TCGATiiCCC Tr TTTTT TT*(" TTr-r ± * -r^-i- ' ''I'll 1 ' I U 



£ L V 



-ORF pLM1



£ K 6 M K L 7 0 1 R »- E A L N 5 A H Q L 0 Q L R £ T , H N V 
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Tuesday, 18 November 1997 11:48 
fig 34 pLM4 (1 > 10070) Site and Sequence

GCAGT rGGAGGTGGACCTGCrGAAAGCAGAGAATGACCGACTGAAGGTAGCCCCAGGCCCCT CATC AGGCTCC ACTCCAGGGC AGGTCCCTGGATCA7C7 
CGTCAACCTCCACCTGGACGACTTTCGTCTCTTACTGGCTGACTTCCATCGGGGTCCGGGGAGTAGTCCGAG GTGAGGTCCCGTCCAGGGACCTAGrAGA 

-insert pLM1



Pagetf 




0 > £, v 0 LL K A E N D R L K V A PGPSSGSTPGQV=>GS 



GCATTATCTTCCCCACGCCGCTCCCTAGGCCTGGCAC TCACCCATTCCTTCGGCCCCAGTCTTGCAGACACAGAC CTG TC ACCCATGGA7GGCATC ACTA 
CGTAATAGAAGGGGTGCGGCGAGGGATCCGGACCGTGAGTGGGTAAGGAAGCCGGGGTCAGAACGrCTGTGTCTGGACAGTGGGTACCTACCGTAGTCAT 




-ORF pLM1



ALS SPRRSLG LAL TH SFG PS L AO TDLSPMDG I S 

CTTGTGGTCCAAAGGAGGAAGTGACCCTCCGGGTGGTGGTGAGGATGCCCCCGCAGCACATCATCAAA GGGGACTTGAAGCAGCAGGAA7TCTTCCTGGG 
GAACACCAGGTTTCCTCCTTCACTGGGAGGCCCACCACCACTCCTACGGGGgCGTCGTGTAGTAGTTTCCCCTGAACTTCGTCG TCCTTAAGAAGGACCC ™" 

-insert pLM1




-ORF pLM1 



TC GPK£ £ V TL RVVVR riP PQ HI IKGDLKQQEFFLG 

CTGrAGCAAGGTCAGTGGAAAAGTTGACTGGAAGATGCTGGATGAAGCTGTTTTCCAAGTGTTCA AGGACTATATTTCTAAAATGGACCCAGCCTCTACC 
GACATCG TTCCAGTCACCT fTTC AACTGACCTTCTACGACCTACTTCGACAAAAGGTTCACAAGTTCCTGATA TaAAGATTTTACCTGCGTCGGAGATGG ^ 




-ORF pLM1



C S K V S G K V D w K M L D £ A V F Q V F KDY ISKMO^AST 

CTGGGACTAAGCACTGAGTCCATCCATGGCTACAGCATCAGCCACGTGAAACGAGTGTTGGArGCAGAGCCCCCCGAGATGCC TCCTTCCCGTCGAGGTG 
GACCC ^A7TCGTGACrCA3GTAGGTACCGATGTCGTAGTCGGTGCACTTT3CTCACAACCrACGTCTCGGGGGGCTCTACGGAGGAACGGCAGCTCCA: 




L G I 



-ORF pLM1



T £ S I H GYSISHVKRVLOAEPPEMPPCRRG 



TCAATAACATATCAGTCTC:CTCAAAGGTCTGAAGGAGAAATGCGTCGACA:CCrGGTGTTCGAGACGCrGATCCCCAAGCCGATGATGCAGCAC TACA7 
AGTTATTG7ATAGfCAGAGGGAGTTTCC AGACTTCCTCTTTACGCAGC TGTCGGACCACAAGCTCTGCGACTAGGG GTTCGGC TACTACGfCGTGATGTA 

-insert pLM1






A^SCCrcCTGCTGAAGCACCGGCGCCTCCTCCTCTCGGGCCCCAGCGGCACGGGCAAGACCTACCTGACCAATCGCTTGGCC GAG TACC TGG TGGAGCC: 
T "CGGAGGACGACTTCGTGGCCGCGGAGCAGGAGAGCCCGGGGTCGCCGTGCCCGTTC TGGATGGAC TGGTTAGCGAACCGGCTCATGGACC ACCTCGCC: 



-ORF pLM1



3ULLKHRRLVLSGPSGTGKT 



YLTNRLAEYUVEft 



TC ^GGCCGTGAGG TC AC AG AGGGC ATC GTCAGC AC CTTCAACATGC AC CAGCAGTCTTGCAAGG A TCTGC A AC TGTAT CTTTCCAACCTAGCCAACCAGA 

AG ACCGGCACTCCAGTGTC TCCCGTAGC AGTCGTGGAAGTTGTACG TGGTCGTCAGAACG TTCCTAGACGTTGACATAGAAAGGTTGGATCGGTTGG'T'-*' ^ 



-ORF pLM1



SGR EVTEG IV STFNMHOQSCKOU QLY 



L S N L A N Q 



TAGACCGGGAAACAGGAATTGGGGATGTGCCCCTGGrGArTCTATTGGATGACCTGAGTGAAGCAGGCTCCATCAGTGAGTTGGTCA 

atctggccctttgtccttaacccctacacggggaccactaagataacctac tggactcacttcgtccgaggtagtcactcaaccagttaccccgggagt^ <560<: 



-ORF pLM1

1 °. R E T G t G 0 V P L y I L L D D L S E A G S I S E L V N G A L T 

c tgcaagtatcataaa ^g^ccctatattataggtaccaccaatcagcctgtaaaaatgacacccaaccatggcttgcacttgagcttcaggat gttgacc 
gacgttcatagta tttacagggatataatatccatggtggttagtcggacatttttactgtg ggttggtaccgaacgtgaac tcgaagtcctacaact^:; 5?C ' : 

-insert pLM1 



-ORF pLM1

j * Y . H K C P Y 1 ; G T T N Q P V KMTPNHGLHLSFPMLT 



<""c tccaacaacg tgcagccagccaatggcttcc tggttcgttacc tgaggaggaagctggtagagtc agacagcgac at/ca atgcc aacaaggaagagc 

AA3AGGT TGTTGCACCTCGGTCGGTTACCGAaGGACCAAGCAATGGACTCCTCCTTCGACCA TCTC AGTCTGTCGC TGTAGT TACGGTTGTTCCTTCTC^ ^ 



-insert pLM1



SNNVEPANGF 



-ORF pLM1

L v , R Y LRRKLVESOS D I NANJCEE 



ATCGGCCC TTGCT TC T7 



TTCGGGTGC TCGACTGGGTACCC AAGC TGTGGTA TC ATC TCCACACC TTCCTTGAGAAGCACAGCACC TC A GACTTCCTC 

acgaagcccacg agctgacccatgggttcgacaccatag tagaggtgtggaagga^ctcj "tcgtgtcgtggag tctgaaggagtagccgggaacgaaga! 

-insert pLM1



-ORF pLM1



LLRVLDV VPKL VYHLHTFLEKHSTSOFL 



! G P C F F 






TC TGTCGTGTCCC ATTGGCATTGAGGACTTCCGGACC TGGTTCATTGACC TGTGGAAC AACTCTATCATTCCC TATCTAC AGGAA GGAGCC AAG3A T«Vj'' 
AGACAGCACAWAACCGrAACTCCTGA A GGCCTGGACCAAGTAACTGGACACCTTGrTGAGATAGTAAGGGATAGAyGTCC TTCCK ^ 



«-SCP IGIEOFRT 



-ORF pLM1



VFIDLWNNSIIP 



YLQE.GAKOG 



ArAAAGGTCCATGGACAGAAAGCTGCTTGGGAGGACCCAGTGGAATGGGTCCGGGACACACTTCCCTGGCCATCAGCCCAACAAGACCAATCAAAfir 



TATTTCCAGGTACCTGTCTTTCGACGAACCCTCCTGGGTCACCTTACCCAGGCCCTGTGTGAAGGGACCGGTAGTCGGGrTGTTr 



TG" 



TGGTTAG TTTC GACA 




1KVHGQKAAVED 



PVEVVROTLPVPs 



AQQOQSK'L 



ACCACCTGCCCCC 



CCACCCACCGTGGGCCCT CACAGCArTGCCTCACCTCCCGAGGATAGGACAGTCAAAGACAGCACCC CAAGTTCTCTGGACTCAGATC^ 
TGGTGGACGGGGGTGGGTGGCACCCGGGAGTGTCGTAAC^^ »OT 



YHLPPPTVGPHS 



1 A 5 P P E D R J V <OSTPSSLDSDP 



TC 



rGATGGCCATGCTGCTGAAACTTCAAGA.GCT GCCAACTACATTGAGTCTCCAGATCGAGAAACCATCCTGGACCCC AACCTTCAGGCAAr.rTTT., 

a^ctaccggtacgacgactttgaagttcttcgacggttgatgtaactcIgaggtctagctctttggtaggacctggggItggaagtcc^tt^^ 



-insert pLM1 



-ORF pLM1 



L M A M L l * L Q E A A N Y I E S P 0 R E T ! L 



0 P N L Q A r L 



GGQTTCGGCAATCACTGTCACCCCCGGACAGCAGA ACGCTGGCATCAGCTATCTTAGCTCCTCCTCTCCCCTCTCCTCT TTCAGAGCA'" TGGCT~Tf'f'&" 




C-:CCAGGAGGAGAACAGGAGGGAGGAGGAGATGAA A GAGGAGGGACAGGTTC T T G GTGCTGT A CCTTTGA G . A C TTCC T AGGAAGGAATGGTG Ga( -.T,.;- 
G*3GG TCC TCCTC TTGTCCTCCCTCC TCCTCTACTT TC TCCTCCCTGTCCAAGAA CCACGACA TGGAAAC TCTTGAAGflATrr TTrrT t.». r/~ a rrrr- a/-i-!- 



v C C T F E N F L G R N G G V A 

GTTTGGGA,CTTGTGCCCCCTAAACACA T TTACT GGCCTCCTCTAATGACTTTGGGGAAAAGATGATTCTGGGTC TTTCCCTTGACTTCTTr.Tr T r..T7 
CA aA CCCTTGAACACGGG C GAT T TGTGTAAATGACCGG AGGA G ATTACTGAAACCCCTT; T CTACTAAGACCCAGA A A GGGAACTGAA G , a r, A ^ 



F G N L C P L N 



-insert pLM1



T F T G L L 



LVGKODSGSFP 



L L V $ I 






ACAAACTCCTGGGC 



TTTCTGGGGAGGGGTTCAGAAAACATCAAAACACTGCAGCAGTTCCCCGGAATTCAGCTTGGACTTAACCAGGCTGAAC 



TTGCTCA 



tgtttgaggacccgaaagacccctccccaagtcttttgtagttttgtgacgtcgtcaaggggccttaag tcgaacctgaattggtccgacttgaacgagt 



-insert pLM1 



T N S V A F V G G V Q K T S K H C S S S P £ F 5 L 0 L T R L N L L 

AAAGAAGCCGAATTCC AGCACAC ^^CGGCCGTTACTAGTTCTAGATAACTGATCATAAT*CAGCCATACCACArTTGTA GAGGTTTTACTTGCTTTAAAM 
TTTCTTCGGCTTAAGGTCGTGTGACCCCCGG^^ 

> , 

-insert pLM1

* R SftIPAHVRPLLVLON.S 



sec. 



S A I P H L 



* F V L L 



AACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAAT 

rTGGAGGGTGTGGAGGGGGACTTGGACTTTGTATTTTACrTACGTTAACAACAACAATTGAACAAATAACGTCGAATATTACCAATG 



T S H T S P 



TTTATTTCGTTAT 



SSCC 



N I K 



MQLLLLTCLLQL I 



M V T N K A I 



GCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAG 



TTGTGGTTTGTCCAAACTCATCAATGTATCTTAACGCGTAAATTGTAAGCGTTA 



CGTAGTGTTTAAAGTGTTTATTTCGTAAAAAAAGTGACGTAAGATCAACACCAAACAGGTTTGAGTAGTTACATAGAATiGCGCATTTAACATTCGCAAT 

I ,„ 

A S Q . 1 S Q 1 K H F F H C ILVVVCPNSSMYLNA 



700C 



I V S V 



AAAATCCCTTATAAATCAAAAGAATAGAC 



ATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGA AATCGGC 

tataaaacaattttaagcgcaatttaaaaacaatttagtcgagtaaaaaattggttatccggctttagccgttttagggaatatttagttttcttatctg 

i t on 

NILLKFALNFC 



7ic: 



! S S F F N Q 



aeigkipykske 



cgagatagggttgagtgttgttccagtttg gaacaagagtccactattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatca^ 
gctctatcccaactcacaacaaggtcaaa ccttgttctcaggtgataatItcttgca cctgaggttgcagtttcccgctItttggcaga 



E IGLSVVPVWN 



^SPLLKNVDSNV 



KGRKTVYQGO 



CGCCCAC TACGTGAACCATCACCCTAATCAAGTTTTTTGGGG TCGAGG TGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTT AGAGC TT 

ccgggtgatgcacttggtagtgggattagttcaaaaaaccccagctccacggcatttcgIgatttagccWgggatttccctcgggggc^aa 



GPLREPSP 



rrroTr 



s , s F L G s RCRKALNRNPKGSPRFRA 



gacggggaaagccggcgaacgtggcgag aaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgt aa^^ 

CrGC ^TTTCGGCCGCTTGCACCGCTC^ ^ 



* G K p A N varkegkkakgagar 



ALASVAVTURVT 



CACCACACCCGCCG CGCTTAATGCGCCGCTACAGGGCGCGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTT ATTrTK 
GTGGTGTGGGCGGC^GAATTACGCGGCGATGTCCCGCGCAGrCCACCGTGAAAAGCCCCTTTACACGCGCCT TGGGGATAAAf aa^Ta^aa ah attta^ 



T TPAALNAP 



LQGASGGTFRGNVR 



G T P I C L 

' "rTCAAATATGTA TCCGCrCATGAGACAATAACCCTGArAAATGCTTCAATAATATTGAAAAAGGAAGAGTCCTGAGGCGGAAAGAACCAGC TGTGGAA 

GTAAGTTTATACATAGGCGAGTACTCTGTTATTGGGACTATTTACGAAGTTATTATAACTTTrTCCTTCTCAGGACTCCGCCrTTCTTGGrCGACACCTT W * 



M S N H Y P L M R Q , p , , M L Q ■ Y . K R K S P E A E R T S 



C G 



TGTG 



TGTCAGTTAGGGTG TGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGrATGCAAAGCArGCATCTCAATTAGTCA GCAACCAGGTGTGGAAAr.Tr. 

ACACCTTTCAG3 



ACACACAGTCAATCCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAATCAGTCGTTGGTCCA' --- 



" C V 5 • G V . E S P Q A P Q Q A E V C K A C IS ISQQPGvesP 
CCAGGCTCCCCAG CAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATC CCGCCCCTAACTCCG" 

ggtccgaggggtcgtccgtcttcatacgtttcgtacgtagagttaatcagtcgttggtaIcagggcggggattgaggcgggtagggcggggattgaggc 1 . ?5i ' : 
0 A p ° 0 A e v c « ; c I S [ S 0 Q P . S R p . L R p s R p L p 

CCAGTTCCGCCCATTCTCCGCCCCATGGCT GACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATT CCAGAAGTAGTGAG" 

ggtcaaggcgggtaagaggcggggtaccgactgattaaaaaaaataaatacgtctccggctccggcggagccggagactcgataaggtcItcatcact.-.- ?9a 

" V P P I L R P H A 0 . F F L F M 0 R P R P P R p L s Y s R s $ £ 

aggcttttttggagg cctaggcttttgcaaagatcgatcaagagacaggatgaggatcgtttcgcatgattgaacaagatggatt gcacgcaggttctcc 
tccgaaaaaacctccggatccgaaaacgtttctagctagttctctgtcctactcctagcaaagcgtactaacttgttc tacctaacgtgcgtccaagag^ ^ 

r 



EAFLEA.AF 



-Kan/Neo



A . K 1 0 Q E T G GSFRM IEQDGLH 



A G S P 



GGCCGCTTGGGTGGAGAGGCTATTCGGCTATG ACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCG CAGGGGCGCCrG 
CCGGC6AACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACTACGGC6GCACAAGGCCGACAGTCGCGTCCCCGCGGGC *"* 



AAVVERLFG Y D V 



A , Q Q T 1 G C S DAAVFRLSAQGRP 



GTTCTTTTTGTCAAGACCGACCTGTCCGGTGC CC TGAATGAACTGCAAGACGAGGCACCGCGGCTATCGTGGC TGGCCACG ACGGGCGTTCC TTGCGCA5 
CAAGAAAAACAGTTCTGGCTGGACAGGCCACGGGACTTACTTGACGTTCTGCTCCGTCGCGCCGATAGC ACCGACCGGTGCTGCCCGcTAGGTArnrfTTr »* 



-Kan/Neo 

L F V K TOLSGALN 



EI-OD E AARLSWLATTGVPCA 



CTGTGCTCGACGTTGTCACrGAAGCG6GAAGGG ACTGGcrGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCT CACCTTGCTCCTr,crr,Ar,^ 
GACACGAGCTGCAACAGTGACTTCGCCCTTCCCTGACCGACGATAACCCGCrTCACGGCCCCGTCCTAG^GACAGTAG^TGGAACGAGGArn.rrrT; *** 



-Kan/Neo 



A V L ° V V T . E * G » 0 V L L LGEVPGQDLLSSH 



L A P A E t 



Af CCATCATGGCTGATGCAATGCGGCGGC T GCATACGCTTGATCCGGC TACCTGCCCATTCGAC CAC CAAGCGAAAC ATCGCATCGAGCGAGCACGT 

3TTCGCTTTGTAGCGTAGC TCGCTCGTGC^ 



V S 1 ■ " A 0 A " R * L H T L 0 P A T C P F D H Q A K 



H R I £ R A P 



ACTCGGATGGAAGCCGGTCTTGTCGATCAGGA TGATC TGGACGAAGAGTA 



TCAGGGGC TCGCGCCAGCCGAAC TGTTCGCCAGGC TC AAGGCGAGC ATGC 



tgagcctaccttcggccagaacagctagtcctax7a^ctgcttctcgtagtccccgagcgcggtcggcttgac sea 

A , E LFARLKASM 



r R H £ A G L VOOOOLOEEHOGLAP 
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CCGACGGCGAGGATCTCGTCGTGACCCArGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATrCATCGACTG 



GGCTGCCGCTCCTAGAGCAGCACTGGGTACCGCTACGGACGAACGGCTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGCTGc- 



TGGCCGGC- 



ACCGGCCGA 



S6C» 



P D G E 0 L V V THGDACLPNIMVENGRF 



SGFIQCGRL 



GGGTGTGGCGGACCGCTATCAGG ACATAGCGTTGGCTACCCGTGATAT 
CCCACACCGCCTGGCGATAGTCC 



TGCTGAAGAGCT 



TGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGG- 



TGTATCGCAACCGATGGGCACTATAACGACTTCTCGAACCGCCGCTTACCCGACTGG 



CGAAGGAGCACGAAATGCCA 



-Kan/Neo 



G V A D R Y Q p , A L A T R D I A E ELGGEVAORFLVL Y 



ATCGCCGCTCCCGAT TCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTC TGAGCGGGACTC TGGGGTTCGAAATGACCGA CCAAGC GACGC-" 
TAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAA CTGCTCAAGAAGAclcGCCCTGAGACCCCAAGCTTTACTGGCTGGTTCGCT G r f ;' 



-Kan/Neo

IAAPOSQRIAF 



Y R LLOEFF.AGLVGS 



P T K R R 



CAACC 



TGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGG 



GTTGGACGGTAGTGCTCTAAAGC TAAGG7 



AATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGC 



3TGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCG 
P F C H H E 1 S j P , P p P S M K G VASESFSGTPAG 



S S S A 



ggggatctcatgctggagttcttcgcccacc ctagggggaggctaactgaaacacggaaggagacaataccggaaggaacccgcgct atgacggcaataa 

CCCCrAGAGTACGACCTCAAGAAGCGGGTGGGATCCCCCTCCGATTGACTTTGTGCCTTCCTCTGTTAT^ *** 

L K H G R RQYRKEPAL.RO. 



GISCVSSSPTL 



G G G 



AAAGACAGAATAAAACGCACGGTGTTGGGTCGT 7 TGTTC ATAAACGCGGGGTTCGGTCCCAGGGCTGGCACTCTGTCGATACCC CACCGAG ACCCCATT^ 

tttctgtcttattttgcgtgccacaacccagcaaacaagtatttgcgccccaagccagggtcccgaccgIgagacagctatggggtggctctggggtaa' * ,k 

" 0 " ' * * T V L G R L F J " * G F G P R A G T L S ■ P H R 0 P , 
GGGCCAATACGCCCGCGTTTCTTCCTTirCCCCA CCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGG GGCGGCAGGCCCTGr- 

cccggttatgcgggcgcaaagaaggaaaaggggtggggiggggggttcaagcccacttccgggtcccgagcgtcggttgcagccccgccgtccgggacg' ? - C ' : 

G A MTPAF t,PFPf^p ^^QV^VK AQGSQPTSGRQALP 

atagcctcaggttactcatatatactttagat t^atttaaaacttcatttttaatttaaaaggatctaggtgaagat cctttttgataatctcatgacc, 

rATCGGAGTCCAATGAGTATATATGAAATCTAACTAAATTTTGAAGTAAAAATTAAATTTTCCTAGATCCACTTCTAGGAAAAACTATTAGAGTAC TGGT 
• P ° V T H ' U » t I • N F I F N L K G S R 



» S F L I I S 



TAATCTGCTj 



AAATCCCTTAACGTGAGTTrTCGTTCCAC TGAGCGTC AGACCCCGTAGAAAAGATCAAAGGA TCTTCTTGAGA TC CTTTTTTTCTGCGCG 
TTTAGGGAATTGCACTCAAAAGCAAGGrGACTCGCAGTCTGGGGC ArCTTTTCTAGTTTCCTAGAAGAACTCTAGGA AAAAAAGACGCGCATTAGACGA" 

« S L N V S F R S T £ R 0 T P 



mo: 



KRSKOLLE ilffc 



S A 



CTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTrTCCr^Ann 



GAACGrrrGTTTTrTTGGTGGCGArGGTCGCCACCAAACAAACGGCCTAGrTCTCGATGGTTGAGAAA A lr,r.r 



TAACTGGC fTCAGC AGAGCGC AH 



^CKQKNHR y 0 R w F v C R IKS 



V 0 L F F R R 



L A 5 A E R R 



WO 98/24810 
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xrACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCC 



TACATACC TCGC TC TGC TAATCC TGTTACCA3 



TArGGTTTATGACAGGAAGATCACATCGGCATCAATCCGGTGGTGAAGTTCTTGAGACATCGTGGCGGATGTATGGAGcLr^r; 



ATTAGGACAATGGTC 



-pUC ori



Y Q t L S F 



C S R S 



A T T S R T L 



rCGCTGCTGCCACTCGCGATAACTCGTGTCTTACCGGGTTGGACI 



HRLHTSLC 



S C Y Q 



:tcaagacgatagttaccggataaggcgcagcggtcg ggc TG AACGGGGGG tt cg tg 

ACCGACGACGGTCACC GCTATl 3 > 0 ^c,^, AC CTG^CTGCTATCAATGGCCTATTCC^CGTCGCCAGCCCGA CTTGCCCCCCAAnr^ 970 

-pUC ori



V L L P V A 1 S R v , L P G V T Q ODSYR IRRSGR 



A E R G V R 



CACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAA GCGCC 

TGGATGTGGCTTGACTCTATGGATGTCGCACTCGATACTCTTTrnrr.n 



GTG TG TCGGG TCGAACC TCGCTTGC 



ACGCTTCCCGAAGGGA GAAAGGCGGAC 
TGCGAAGGGCTTCCCTCTTTCCGCCTC 



9eo. 



AHSPAVSERPTPN 



0 T Y 



VSY E KAPRFPKGERRT 



AGG 



TATCCGGTAAGCGGCAGGGTCGGAACAGGA GAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAG TCCTGTCGGGTTTCGCCACC 
TCCATAGGCCATTCGCCGTCCCAGCCTTGTCCTCTCGCGTGCTCCCTCG^GGTCCCCcWTGCGGACC^AGAAATATCAGGACAGCCCAAAnP^ **X 



-pUC ori



G I R 



A A G S E QESARGSFQGE 



T P G IF IVLSGF 



A T 



TCTGACTrGAGCGTCGATTTTTGTGATGCTCGTC AGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCT TTTTACGGTTCC TGGCf TTTTfir Tft 
AGACTGAACTCGCAGCTAAAAACACTACGAGCAGTCCCCCCGCCTCGGATACCTTTTTGCGGTCGTTGCGCCGGAAAAATGCCAAGGACCGGAAAACnAr '°« 



-pUC ori



SOLSVOFCOARO 



3 



G , GGA yGKT PAT RPF r G S V p F A 
GCCTTTTGCTCACA TGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCATGCAT ^ 

cggaaaacgagtgtacaagaaaggacgcaataggggactaagacacctattggcataatggcggtacgta 10070 

C l L L T C S f L R Y P L I L V [ TVLPPC I 
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4*0 
$99 



TM 709 



Thursday, 27 November 1997 15:46

ng 15 pWPft Map (i > t»4i) $** an d Sequence 1 
EnxyrTUB : aji i*« ♦mymes (Nq Riur) 

Settings : Circular, Certain Sites Only, Standard Genetic Code

A™CCArGAmCGCCAA<KTTGCATGCCTG^^ 

CACATCCATTArCCCACCCCCdCTTTCTAACTGACTlTAATTTT'CAGT^ 209 
CTGTAT6ACTAGTTGTTG AGTGaTTTTTCaTTGAGA AAATATTAAAA GCAACATTAT7TACT77 CCTTATTTCCCCT A ACTTTGA TTTAC7TTTTCCATC 396 
AACTAGATCrrACAAAACTTCCAATACAATTCCATTTTCACATTACCCTCtt 
ACTTTCCACAiWTtrrCAACATCCAGGCTTCAGACTCCACAGTCAAGAATATCGAAA^ 
UGAMAGAAGATWTWAMTCTA(mTrmGCAMAm^ 
AAATTTTCaaATACAAArCrAtUAAACTAr 

AGTGGAAGWGCGCAACrrCTATACTGGA^TAATGATCT<^CAAAm<n-G |ft0 

CACTAGTTTC>CAACCrn"CCTTT7TCTATGAAAAAGTAAAAAA^ ^ 

TCXUATTTACAAATCGCCTCCCCTrKCCGAMACrrGCCCACCAAAATOU ^ 

AGOTAACTAMTTGACAGGGACAOnTGTCCCTC^^ UM 

TTCTTCCATTITGCrrATAAACATTTGTCT'CrrGGAAGGAAACTACACGGGGAGACGGT l2W 

T<^TGAATAAA(UTAGAGCCGATGACACTGGCTGGTAGTAG7ATGAGTGTA^AATTGCrTT^ 13ee 
^^^^^^^^^^^^^^^^^^CCTCITrCCTATTCCGCCACTCATTCTCCGn ACC ACAAACT GGAATACAT7TTACT ACTATTCAAGCC 1409 

ATTTATTTTGATATTTAATTTTGT<ZAAT7AGGGATAAACACGACTTTTAAAA 1£6 q 
T^CAAAAAATCAA TAAA TA Tf CCCTAA C AAA TTGTA T G GCT AAAATTTT ATTTCTACTGTT G AC AA T ATCTTT A TATGT A TCACTGTTTTCCATCTCAAA 1G0© 

ACCTrGAArcCCCCMGTTArAGGAAGCTCCGTCTCACATTrCCCATGCTATGAJiTCGCTACrCAGCAC itto 
TTATTGGCCACCCGTMTAAAGTGCAAGCAGTTAGAATT^ 

T7TTCCATGCTrT7T5GCCCA77AAAAAAC"nTCTCA CCTCTTCATCCATCTCACTCGTATCATAAAAAGTATAGCAAAAGCC CGACTCTACTTTTTAAG 1S09 
AGAAGGAGAT AO G A GCC AC ATG GC GTGTG AC C CTTTTC AT CTCCTCCGTTCGGTCT CAAA tTCAC GCTCAT AC TAACTCTTCAAAT A CCC ATAG ACCTC 2060 

CTTGT7TrCTTCTTC<nTTTGACTCCCGCCTArTTT^ aC8 

AAAAACGAGGCAATTT AAAA A r ATTAAAA nAATGAGGTTCTAGATCTAGATTTGCAAAAGAA^^ 2209 

ttctatttajwsgttttc^gatgtt^ 23ee 

ATCTAAGCCTGAACT ATAGCCTTATTCTAGATCTTAGTTGCKATAAGCT 24 00 

GCrTfiCrTCAGTCTAATCCAGACTAGArTTCCAAGAWGTTTTCMTT^ 2SC0 

AAAATCGTTATCCCTTTCTCTCACACTTTCW 2609 

ACna7CAAATAATACAA^rrCAnCCGTCCGTCWGCCGrrCGACTGGCA>TAATAAT(^ 27W 

tcccatccttccggcttctttttagaaattatattatttcagaatcatcatw 2808 

AAAACCTTCTA^CCAC^CCCAGaAOTTCGTGTTGCrACAACTACAJUAArCGCAAGCT ^ 

KTTGCTTCTCTGAAGACTATTGGAGCAAaACAAGAGCC 3de0 



WO 98/24810 60/270 PCT/EP97/06956 



Thursday, 27 November 1997 15:46

fin 3S ONPB M ap ft > 1264H Stta and Sjfljgncg _ **9« ? 

aTCrrCCTCATC^TA^CCAC^CaACGACAAACCCMCMCOTGCCTCAACM 31W 
CAACCCGCCG»CCACTrAAGCTGG<iAACTGCCACCTCTATGTCCAACCrrT<7rACG(r^ 3M0 
^TTCTACAAAATAMTTAAAAATAAGATrrmcaaCAn ^ 
*™CAGATATA<^AMGA>ATAAAAAAT^^ ^ 
AACAAATTTTAAAACCCTAffTTTTCCGAACCTCTCCCCCrCATCTA ^ 
CCAACaAaAATAAAAATWTUGACAATTCGWTTGTCrCCUmTCTT 3^ 
TTCTC^CCCCCCATAiUCACTCTTCCCGGAAAAATCTTGCAACC<UACTCATAT^ 37w 
CAAA6ACTTAAA6CAATTTCTCAGCTCTTCTTC 3&Q0 
TTTGT'GrGAGATG CACTTTTT GAAAAATT Ai CTTTAC GTTTTCA GTTT CT A GT ATTTATTTTTJTCATA TAAATTA CAGCTTCTTAGA CCT5CTATATTT 39W 
TTTAAAACTTCCT ACTGAA^TATACGAGATTCTTTTGACrTTCCGG^TTGTCTTATGCCTT 4^ 
G A A AA AAAMaCTGCATCTTC(rrrrmACATAGTAATTrCCAGCCAAA^ 41TO 
AACGATGCTCAAAGAGCAGTGAAGAAGAGTCCGGATACGCTGGATTCAACAGCACCrrCCCCAACGTCATCATCGACGGM 4200 
CACATCTTCCAAGG I ilui 1 u 1 1 1 AGGAGAACTS • fTTTTTTTGTn I CCTGACCT7 CACATACTCTCCCATCTTT 4TAAAAGTGAGGTCTCTGGGACAC 4300 
CTGCGATAAAATGTGAATCCGCCCA 1 1 iui I GGTACAAAAAACTTTGCAGA SCACCT GCTTTATACATTTTT AGGATAAATGTCATACGGTATTTGTCAA 4490 
ACCCAAAC hit 1 AAA TTTT ATT7TC A G A TCAAAAAT TGATGTT AAAAGTTT AA GA TA TTTAC GAAAAAA TCTTTT A CTT AAAA CTTT7TTAT ATCCATA 4S00 
AAATTTTAGAACATTCAGATAGGAGTTCCGTCCCTAAAACTT^ 4*39 
T7TTTCTGCAACTGAATTTTTAGA C A TGAAACTTTMCTTTCAATrTATCCCATTTGAAACCGT CC CCTT CTATAAAAOTCAAAATTTTCA GACTTCAA 47*0 
CGTCAGACGAA^GTCTCCGTC^TCAGACGATCTTACTCTrAACGCCTCCATCGTGACAGCTATCACAC 4400 

tattatcaagvagcctcttcaggtcagta i nrrrcn laaCTAGAGccTTcnacAAGmcGcaAAAmrATTAAcrrGGmAGAGGcrcGc 4933 

A^GCCAT7GATCAAGCATGGGCTAAACTGGGCCGGCCTGAACCAT(n'ACATArrCTT7GGCCCGAGr ^ 
AAGTCG GACTAG GCAAAA GT G C AAAAA TG C AAAA TA TCTTGAAATTCAAT A C G CTTTC C C TCTTTC CA TTCTT C CTTTTTT G7 CCTCTTTTGTTGCA G AT 5160 
TTTCCCTTTTTTACA iTTTAAGA JTTTG A TACACTTTAATGTCTG GCTT CGCCTTCCTAAGAGCCTTC CTATA TTTTC GAAAAA rAATCAATTTTTAGGA 520© 
AAAACCAACACTGGCAGTGAXfcGGAGTGAAAAGCACAGCGAAAAAAGATCCACCT 5W0 
ACTCCAATTATGC^CATAAGA^GrrGACAAATGCTACGTTTATTTCT 5439 
AGTAATTTTTGCjATTATTTAAACTTTGTG J5e0 
TAAGrniTTTTCTGAAAGTTGTTTAAA^AATTGAGTTAAAAA^ 56Cd 
TTTCCAAATATGTCTAATACTAAGATTrGTTTGTGAATTrACAAAACATATTTfAA^ S7C8 
AAAMTCAGAAA^GCGA^ATTrAAnCCAAAAAAJUTATmGAA^aCACAAA fAAAGCT AC1TTTCAAA AAATCAACAAAAAAAATATCAAAAGAT 5800 
TCATATTTTtUCJUATACJiGACATAGTCAAAAACTArCAAAAAAT^^ 59eo 
TTGAGAAAAArrCCACAAATTGAAy^AAAATTCCTTTGAGGAAAA TTTAAAAATTfTAAATGTCTOATri OGAAACCAAGCATTTrCCGACPTTCGGC 60*0 
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*^agatcagaagaaattggaccaactacccacatccattatgccacccgcggtttctaaattaccctcgcca:gtctcgccacgtc4gcaaccg 






GCTTCA 



TT7TC TAGTC ^^^^^^^ A ^CTCGTTGATGGGTGTAGGTAATACGGTGGGCGCCAAAGATTTA ATGGGAGCGGTGCACAGCGGTGCAGTCGTTuSrn^^r. " 



-C.e.unc53 xba



K 0 Q ^ K L EOLPTSIMPPAVSK 



LPSPRVATSAT 



A 3 



GCAACTAACCCAAATTCCAACTTT^ 

CGTTGATTGGGTTTAAGGTTGAAAGGTGTTTACAGTTGTAGGTCCGAAGTCTGAGGTG TCAG "TTCTTATAGCTTTTAACTAAG TAGTTTCTAACCATAiiT '** 



-C.e.unc53 xba 



A T N P A| s NFPQMSTSRLQTPQSA 



S K [ D S S K I G I 



AGCCAAAGACGTC TGGACTTAAACCACCCTCATCATCAACCACTTCATCAAATAATACAAATTC ATTCCGTCCGTCGAGCCGTTCGAGTGGC 
TCGGTTTCTGCAGACCTGAATTTGGTGGGAGTAG " ~" 



AATAATAA 




> P K T S G L K P P S s S T T S S M N T W S F R P 3 S R S S G N N N 




2 I CC 



CCTTGGGATCCACCG GATCTAGATAACTGATCATAATCAGCCATACCACArTTGTAGAGGTTTTACTTGCTTTAAA^ACCTCCC ACACCTCCC^TGAA 
GGAACCC TaGGTCGCC TAGATC T ATTGACT AGTATTAGTCGGTATGGTGTAAACATC TCC AAAA TGAACCiAAA TT T TT Tnr.rtK.tr rr. rr.r \rr-rr An- 1 



0. 



-eGFPC.e.unc53xba 



voppolon 



S A l P H L 



R F Y L L 



K T S H T S P 



CCrGAAACATAAAArGAATGCAATTGTTGTTGTT AACTTGTTTATTGCAGCrTATAArGGTTACAAATAAAGCAATAGCATCA CAAATTTCACAAATAA, 
G.ACTTTGTATrTTACTTACGTTAACAACAACAATTGAACAAATAACGTCGAATATTACCAATGTTTATlTCGTTATCGlAGTGITTAAAGTGTTTATT; 
' ' " ' ' ■ " ° L 1 L L T C L L Q L I H V T N K A | A S Q I S 0 I , 

GC.-,rTTTTTTCACTGCArTCTAGTTGT G G TTTGTCCAAACTCATCAATGTArCTTAACGCGrAAATT G TAAGCGTT,ATATTTTGTTAAA ArT; 3 CGTTA 
CGTAAAAAAAGrGACGTAAGATCAACACCAAACAGGrTTGAGTAGTTACArAGAATTGCGCATTTAACArrCGCAATTAlAAAACAATTTTAAo-GCA^ 
" F F " C ' L V V V C P " S S h V L N A . , » 3 V H I L L K F * t 
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• AATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTArAAATCAAAAGAATAGACCGAGAr AGGGTTGASrGTTGTT: 
TTAAAAACAATTTAGTCGAGTAAAAAATrCGTTATCCGGCTTTAGCCGTTTTAGGGAATATTTAGTTTTCTTATCTGGCTCTATCCCAACTCACAACAAS 
■ N F C .• 1 S 5 F F N Q . A £ 1 G K [ P Y K S * £ . T E I G L S V V 

CJGrTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCC CACTACGTGAACCATCACC 
GTCAAACCTTGTTCTCAGGTGATAATTTCTT6CACCTGAGGTTGCA6TTTCCCGCTTTTTGGCAGATAGTCCCGCTACCGGGTGATGCACrTGGTA6TG3 ^ 
PV VNK SP LLK NVOSWVK GRK TVYQGO GPLREPSP 

CTAATCAAGTTTTTTGGGGTCGAGGT6CCGTAAA6CACTAAATCGGAACCCTAAAGGGAGCCCCC6ATTTAGAGCTTGACGG GGAAAGCCG6CGAACGTG 

GATTAGTTCAAAAAACCCCAGCTCCACGGCATTTCGrGATTTAGCCTTGGGATTTCCCTCGGGGGCTAAATCTCGAACTGCCCCTTTCGGCCGCTTGCAC ^ 
■ S S F L G S R C R K A L N R N P K G S P R F R A R G K P A N V 

GCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCG CTTAATG 
CGCTCTTTCCTTCCCITCTTTCGCTTTCCTCGCCCGCGATCCCGCGACCGTTCACATCGCCAGTGCGACGCGCATTGGTGGTGTGGGCGGCGCGAATTAC ^ 
ARK EG KKA K G AGA RAIASV/A VTL R VTTTPAALN 

CGCCGCTACAGGGCGCGTC AGGT66CACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATT C AAA TA TG TA TCCGC TCA T 
GCGGCGATGTCCCGCGCAGTCCACCGTGAAAAGCCCCTTTACACGCGCCTT6GGGATAAACAAATAAAAAGATTTATGTAAGTTTATACATAGGCGAGTA 
APLQGAS G6T FRGNVRG TP I CLFF . I H S N M Y P L M 

GAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTCCTGAGGCGGAAAGAACCAGCTGTGGAATGTGTGTCAGTTAGGGTGTGGAA 

ctctgttattgggactatttacgaagttattataactttttccttctcaggactccgcctttcttggtcgacaccttacacacagtcaatcccacacctt 
R ° p ; • w »• 0 . V . KRKS PEAER tscgmcvs . G V E 

AGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGC agg cagaas 

tcaggggtccgaggggtcgtccgtcttcatacgtttcgtacgtagagttaatcagtcgtIggtccacacctttcaggggIccgaggggtcgtccgtctk 31 a 
SPQ APQQAEV CKA CIS isqqpgv espqa p o q a e 

TATGCAAAGCA TGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCC 

atacgtticgtacgtasagtiaatcagtcgttggtatcagggcggggattgaggcgggtagggcggggattgaggcgggtcaaggcgggtaagaggcg'jG 32C " : 

V C K A C ' S . 1 S 0 ,0 P ■ S R P . L R P S R P . L R P V P p | L R F 

CATGGCTGACT AArTrTTTrTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCrrTTTTGGAGGCCTAGGCT 
GTACCGACrGATTAAAAAAAATAAATACGTCTCCGGCrcCGGCGGAGCCGGAGACTCGATAAGGTCTTCArCACTCCrCCGAAAAAACCTCCGGATCCGA * K>: 
" A °. • F F > F " 0 » P R P P R P I S Y S R S S £ E A F L E A A 

TTTGCAAAGATC GATCAAGAGACAGGATGAGGATCGTTTCGCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTAT 
AAACGTTTCTAGCTAGrTCTCTGTCCTACTCCTAGCAAAGCGTACTAACTTGTTCTACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATA ^ 
ft * * DOE TG • G S F R M [ EODG LHA GSPAAVVERL 

TCGGCTATGACT GGGC ACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAG3GGCGCCCGGTTCTTTTTGTCAA5ACCGACCT 

agccgatactgacccgtgttgtctgttagccgacgagactacggcggcacaaggccgacagtcgcgtccccgcgggccaagaaaaac agttctggc TGtti "° : 

FGYOWAQ QT I GC SOAAVFRL SAQGRPVLF V K T 0 L 

GTCCGGTGCCCTGA ATGAACTGCAAGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGC7CGACGTTG TCAC TGAA 
CAGGCCACGGGACTTaCTTGACGTTCTGCTCCGTCGCGCCGATAGCACCGACCGGTGCTGCCCGCAAGGaacGCGTCGACACGAjCTGC AACAG TGACT" X * 
JGALNE LOOEAaRLSV LATTGVPCAAWL0 7VTE 
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' G'-SGCAAGGG/tCTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGrCATCTCACCTrGCTCCTGCCGAGAAAGTATCCATCATGGCTGA ToCAA 

CGCCCrTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGACAGTAGACTGGAACGAGGACGGCTCTTTCATAGGTAGTACCGACTACGTT J7C ' 
A G ft 0 V L L LGE VPGQDLLSSHL A P a E K V S 1 M A 0 A 

f GCGGCG6C TGCATACGCTTGATCCGGC TACCTGCCCATTCG ACCACCAAGCGAAAC ATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCG GTCTT'jT 
ACoCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAGCrGGTGGTTCGCTTTG TAGCGTAGCTCGCTCGTGCATGAGCCTACCTTCGGCC AGAACA 
M R ft L H r L D P A T C P F 0 H Q A K H R I E R A ft T R h E A G L V 

CGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGAGCATGCCCGACGGCGAGGATC TC GTCGTG 

gctagtcctactagacctgcttctcgtagtccccgagcgcggtcggcttgacaagcggtccgagttccgctcgtacgggctgccgctccIagagcagcac 3X1 

DQODLD EEHQGLAPAE LFARLKA SMPOG E D L V V 

ACCCATGGCGA TGCCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAG G 
rG3GTACCGCTACGGACGAACGGCTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGCTGACACCGGCCGACCCACACCGCCTGGCGATA3T,-c ^ 
T " G ° A C L P N '. " v E " G R F S G F I 0 C G R I G V A 0 R Y Q 

ACATAGCGTTG GCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCG CAGCG 
TGTATCGCAACCGATGGGCACTATAACGACTTCTCGAACCGCCGCTTACCCGACTGGCGAAGGAGCACGAAATGCCATAGCGGCGAGGGCTAAGCGTCGC * ,W 
° ' A L A T R 0 1 A , E ^ »- G 6 E V A 0 R F L V L Y G t A A P 0 S Q R 

CATCGCCTTCTA TCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATTTCG 
6TAGCGGAAGATAGCGGAAGAACTGCTCAAGAAGACTCGCCCTGAGACCCCAAGCTTTACTGGCTGGTTCGCTGCGGGTTGGACGGTAG TGCTCTAAAGC 
' A F Y " L > P E F F ■ A G L V 6 5 K PTKRRP TCHME I S 

ATrcCACCGCCGCCTTCTATGAAAGGTTGG GCTTCGGAATCGTTTTCCGGGACGCCGGCTGG ATGATCCTCCAGCGCGGGGATCTCATGCTGGAG T TC TT 
TAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCT-AAGAA ^ 
IPPPP SMKGWASESFSGTPAG ■ S SSAG ISCUSS 

CGCCCACCCTAGGGGGAGGCTAACTG AAACACGGAAGGAGACAATACCGGAAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAACG CACGGT 
GCGGGTGGGAICCCCCTCCGATTGACTTTGTGCCTTCCTCTGTTATGGCCTTCCTTGGGCGCGATACTGCCGTTATTTTTCTGTCTTATTTTGCGT3CCA ^ 
3 ! T L G 6 G • L ' H G R R ° ,V » . * E P A L . R Q , K D R I K R T V 

GTTGGGTCGTTTG rTCATAAACGCGGGGTTCGGTCCCAGGGCTGGCACTCTGTCGATACCCCACCGAGACCCCATTGGGGCCAATACGCCCGCGTT TCTT 
C AACCCAGCAAACAAG rATTTGCGCCCC AAGCCAGGG TCCCGACCG TGAGACAGC TATGGGG TGGC TCTGGGG TAACCCCGG FTA TGCGGGCGC AA AGAA 
L G " L F ' " A G F G p R A G T L S ■ P H R 0 P , G A N T P A F L 

CCTTTTCCCCACCCCACCCCCCAAGTTC GGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGGGGCGGCAGGCCCTGCCATAGCCTCAGGTTACT CATA7AT 

GGAAAAGGGGTGGGGTGGGGGGTTCAAGCCCACTTCCGGGTCCCGAGCGTCGGTTGCAGCCCCGCCGTCCGGGACGGTATCGGAGTCCAAfGAG TATATA 
" F P " P T " 0 V R V * * Q G S Q P T S G R 0 A L P . P 0 V T H I 

AC TTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAA3ATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCG 

tgaaatctaactaaattttgaagtaaaaattaaattttcctagatccact tctaggaaaaac fattagagtactggttttagggaattgcactc aaaagc '"^ 

" ' * L ' i 1 F ' F N l * 6 S R . R S F L , I S , p K s L „ v s p „ 

TTCCACTGAGCGTCAGACCCCGTaGAAAAG ATCaaagGATCTTC TTGAGATCCTTTTTTTCTGCGCGTAATCTGC TGC TTGC AAACAA AAAAACCAC"GC 

AA3GrGACTCGCAGTCTGGGGCATCTTTTCTAGTTTCCTAGAAGAACTCTAGGAAAAAAAGACGCGCATTAGACGACGAACGTTTGTTTTTTTGGT3G r ' : * * 
> T E R 0 T P . KRSKOlLE ILFF C A . S A A C K 0 K H „ p 
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TACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATA 



actgtccttc Tag: 



AT3GTCGCCACCAAACAAACGGCCTAGTTCTCGATGGTTGAGAAAAAGGCTTCCATTGACCGAAGTCGTCTCGCGTCTATGGTTTATGACAGGAAG * 
I ° R F V C R ' * S y g > F F R R . L A S A E R R y o , U s F ^ 



TCA 



GTAGCCGTAGTTAGGCCACCACTTCAAGAAC TCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGC CAGTGGCGAT^-: 
CATCGGCArCAATCCGGTGGTGAAGTTCTTGAGACATCGTGGCGGATGTArGGAGCGAGACGATTAGGACAATGGTCACCGACGACGGTCACCGCTATT-' 

C S " S • A T - T 5 * T L ■ H ; i h T s t c . s c r q y L L p v , ; 

rCGTGTCTTACCGGGTrCGACTCAAGACGA TAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCA CACAGCCCAGCTTGGAGCGAA 
AGCACAGAATGGCCCAACCTGAGTTCrGCTATCAATGGCCTATTCCGCG^GCCAGCCCGACrTGCC CCCCAAGCACGTGTGTCGGGTCGAACCTCGCT; *"* 
SVL PG V TQDQSYR * R R5GRASRG VRAHS PAVSE 

CGACCTACACCGAACTGAGATACCrACAGCGT GAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGT ATCCGGTAAGCGGCAGGGT 

gctggatgtggcttgactctatggatgtcgcactcgatactctttcgcggtgcgaagggcttccctcttIccgcctgtccataggccatIcgccgtccca 52C,: 

" " T " N • ° T Y S V S Y E < A P R F p k G E R R t G . R . a A G 

CGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGG GGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTG ACTTGAGCGTCGATTTTT, 

GCCrTGTCCTCTCGCGTGCTCCCTCGAAGGTCCCCCTTTGCGGACCATAGAAATATCAGGACAGCCCAA AGCGGTGGAGACTGAACTCGCAGCrAAAAA'' 
S E Q E S A R G S F Q G E r P G ! F , y L S G F A T s p , s , p p ^ 

TGArGCTCGTCAGGGGGGCGGAGCCTATGGAA AAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTrTTGCTGGCC TTTTGCTCACATGTTCTTTC 

actacgagcagtccccccgcctcggatacctttttgcggtcgttgcgccggaaaaatgccaaggaccgg^aacgaccggaaaacgagtgtacaagaaa' *"* 
^ 0 A R Q gg g A Y g k t p a t R p F y g s v p f a g l l l t c s f 

ctgcgttatcccctgattctgtggataaccgtattaccg ccatgcat 
gacscaataggggactaagacacctattggcataatggcggtacgta Sqq7 

■- R Y P L I LWITVLPPCI 
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A 






fAG TTAT TAATAGTAATCAATTACGGGG TCATTAGTTCA TAGCCCATATA TGGAGTTCCGCGTTACA TAAC TTACGGfAAATGGCC 



CGCCTGGCTGACC: 



A TCAATAATTATCATTAGT TAATGCCCCAGTAATCAAGrATCGGGTATATACC TCAAGGCGC AATGTATTGAA TGCCATTTACCGGGCGGACCGAC TGGC 



L L I V I N Y G V 



S S 



PIYGVPRYITYGK 



V P A V L T 



CCCAACGACCCCCGCCC AT T^ A ^^^ A ^^ AA ^ ^ A ^Q^^^GTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAAT GGGTGGAGTATTTACGG y 
GGGTTGCTGGGGGCGGGTAACTGCAGTTATTACT GCATACAAGGGTATCATTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCC ArrTrATAAATrrr a 2 °° 



-pCMV 



AQRPPP t OVN NQvcSHS NANRDP p^ T S M G G V F T V 

aaactgcccacttg gcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggc ATTAT GCCCAGTh 
tttgacgggtgaaccgtcatgtagttcacatagtatacggttcatgcgggggataactgcagttactgccatttaccgggcgga tcat 3C 



NCPLGSTSSVSY 



-pCMV 



A K Y A P Y 



R Q 



MARLALCPV 



CATGACCTTATGGGACTTTCCTACTTGGCAGTACATC 



GTACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGA 



TACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCA 



ATGGGCG TGGA 



M OLMGLSYLAVH 



L , R t S H R YYHGOAVLAVHOWAW 



TAGCGGTrTGACTC ACGGGGATTTCCAAGTCTCCACCCCATTGACGTC AATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCC AAAATGTCGTh 

atcgccaaactgagtgcccctaaaggttcagaggtggggIaactgcagtIaccctcaaacaaaaccgtggttttagttgccctga 500 



I A V 



L T G ' S * . S P P H ■ " 0 « E F V L A P K S T G L S K M S 

ACAACTCCGCCCCATTGACGCAAATGGGCGG TAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGA TCCGCTAGCGCTm 

tgttgaggcggggtaactgcgtttacccgccatccgcacatgccaccct ccagatatatIcgtctcgaccaaatcacttggcagtctaggcgatcgcgat 600 

> 



-pCMV 



0 L R P IOANGR 



A C , T V G <j L VKQSVFSEPSOPLAL 



cc 



GGTCGCCACC ArGGTGAGCAAGGGCGAGG AGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCA CAAGTTCAG'''G 

gg:cagcggtgg ^actcgttcccgctcctcgacaa^gccccacc^^ - 

ECPP - 

- " ' - T " " S K G £ E L F T 6 V V > ' L V E L Q G 0 V w G H K F S 

rGTCCGGCGAGGGCGAGGGCGATGCCACCTAC GGCAAGCrGACCCrGAAGTTCArCTGCACCACCGGCAAGCTGCCCGTGCCC TGGCCCACCCTCGTGAC 
ALAGGCCGCTCCCGCTCCCGCTACGGTGGA T GCCGTrCGACTGGGACTTC AAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACCGGG TGGGAGc"acTg" ^ 

V S ° ' G ' G ° A T Y G « L T L K F I C T T G K L P V P y p T L y T 

caccctg^tacggcgtgcagtgcttcagccgc taccccgaccaca^ 




utygvocfsr 



YPDHMKOHOFFKSAH 



P E G Y V 0 E 



CGCACCATCTTCTTCAAGGACGACGGC AACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAG CTGAAGGGCATC" 
GCGTGGTAGAAGAAGTTCCTGCTGCCGTTGATGTTCTGGGCGCGGC TCCAC TTCAAGCTCCCGCTGTGGGACCACTTGGCGTAGC TCGAfTTrrrnT^i-:- 



* T 1 , F f * 0 0 G N Y K T R AEVKFE60TLVNR 



I B L K G I 



ACTTCAAGGAGGACGGC AACATCCTGGGGCACAAGC TGGAG 



TACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGi 



CAGAAGAACGGCATCAA 



TGA^CCTCCTGCCGTTGTAGGACCCCGTGTT CGACCTCATGTTGATGrTGTCGGTGTTGCAGAT ATAGTACCGGCTGTTCGTCTTC^nrr^^.T 



D F K Z 0 G N 



LG HKLEVNYNSH NVYI MADKQKNG 1>. 



GGTGAACTTCAAGATCCGCCACAACA 



TCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCG GCGACGGCCCCfiTfirTfir T" 
£g£I^£II£IAGG i GGTGTTG^^ 



^ N ^ * Ift HN IEOGSVQL A D H Y Q Q N T P I 



G 0 G P V L L 



CCCGACAACCACTACC 



GGGCTGTTGG 



TGAGCACCCAGTCCGCCCTGAGCAA AGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTfCGT GACCGCCGCCGGGA 
TGATGGACTCGTGGGTCAGGCGGGACTCGTTTCTGGGGTTGCTCTTCGCGCTAGTGTACCAGGACGACCTCAAGC ACTGGCGGcnarrr t 



PONHYLSTQSA 



LSKOPNEKROHM 



VULEFVTAAG 



rCACTCTCGGCATGGACGAGCTGTAC 



AAGTCCGGACTCAGATCTCGAGCTCAAGCTTCGAATTC T6CAG 



TCGA 



AGTGAGAGCCGTACCTGCTCGACATGTTCAGGCCTGAGTCTAGAGCTrfiAr, 



TAAGCTTGATATCGAATTCCTGCAGCC 



TTCGAAGCTTAAGACGTCAGCTATTCGAACTATAGCTTAAGGACGTCar, 



I T L G M D E L 



u. * * S G L » S * A . Q * S N S A V Q K L D t E F L Q P 

CCTGC ^^^^AGCCAGATGCTGGACCCAGAGTCCCAG A GAAAGAGGACAGTGCAGAATGTCC TGGATCTCCGGCAG AACC TGGAAGAGACCATf; Tmr*~ 
G GACGAGAAGTCGGTCTACGACCTGGGTCTCAGGGTC T C TTTCTCCTG ^ACGTCTTA CAGGACCTAGAGGCCGTCTTG GACCTTCTCTnnTAr Anr:r 7^ 




LLFSQMLDPES 



ORKRTVQNVLO 



LRONLEETMS 



CTGCGAGGGTCCCAGGTGACTCACAGCTCCCTGGAGA TGACC fGCTACGAC AGCG A TG A TGCCAACCC ACG CAGrr.Ti'rr rAr:rr Trrrr a .*/-/-*- - 

GACGCTCCCAGGGTCCACTGAGTGTCGAGGGACCTCTAC ^ GGACGATGCTGTCGCTA CT ACGGTTGGGTGCGTrr.r *r Ar.ffrrrff ».<•■<<?/- tti- .- ' . 




L R G S 0 V T 



H s S LEMTCYDSDO 



ANp RSVSSLSNRS 



CCCCTCTGTCATGGCGCTA TGGCCAGTCCAGTCCGrnnr 



— ~ - ' ' 1 « --TGC AGGCTGGTGACGCGCCCTCTGTGGGTGG GAGCTGCCGCTCGGAGGGGACGCCCGCCT:; 

GGjGAGACAGTACCGCGATACCGGTCAGGTCAGGCGCCGA CGTCCGACCAC ^GCGCGGG AGACACCCACCCTCGArr.r.rAAr TG^GGGCGGA: 




S => L 



3WRYG0SSPR 



L Q AGO A P S V G G 3CRSECTPAV 
GraCATGCACGGCGAACGGGCCCAC TAC rCCCACACCATGCCCATGCGCAGCCCCAGC AAGC TCAGCCATATC ICCCGCCTGGAGCTGG 7CGA A ZC TG 
C ATGTACGTGCCGCTTGCCCGGG TGATGAGGGTGTGGTACGGGTACGCGTCGGGGTCG TrCGAGTCGGTATAGAGGGCGG ACC TCGACCaGCTTaG SGAL' 

-insert pLM1 



-ORF pLM1 



Y M H G E R A H Y S H T M P H R S P S K L S H 1 S R L E I V E S L 

1 -■■ * i i ..I. 

GACTCGGATGAGGTGGACCTCAAGTCCGGCTACATGAGCGACAGTGACCTCATGGGCAAGACCATGACGGAGGATGATGACATCA 

CTGAGCCTACTCCACCTGGAGTTCAGGCCGATGTACTCGCTGTCACTGGAGTACCCGTTC TGGTACTGCCTCC TACTACTGTAGTGATGGCCGACCCTAC 



-insert pLM1 



-ORF pLM1 



0 S 0 . E V 0 L K ^ G, I M S D S 0 I h G K T M T E D D Q | T T G V 0 

AAAGCAGCTCCATCAGTAG |"GGACTCAGCGATGCCTCAGACAATCTCAGTTCAGAAGAATTCAATGCCAGCTCCTCAC TCAACTCCCTCCCAAG TA CTCC 
TTTCGTCGAGGTAGTCATCACCTGAGTCGCTACGGAG TC ^GTfAGAGTCAAGTCTTCTTAAGTTACGGTCGAGGAGTGAGTTGAGGGAGGGTTCATGAGG 




ESSSISSGLSO 



S 0 N L SSEEFNASSSLNSLPSTP 



C AC TGCTTCTC GCAGGAAC TCAACAATAGTGC T AC GC AC AG AC TCAGAGAAGCGCTC AC TGGCAGAA AG TGGGCTGAGCTGG TTTAGTGAATCA GAGGAG 

gtgacgaagagcgtccttgagttgttatcacgatgcgtgtctgagtctcItcgcgagtgaccgtctttcacccgactcgaccaaatcacttagtctc^ 



2\Z\ 



-insert pLM1 



-ORF pLM1 



T A S R R N S T I VL RTQSE KRSL AE S G L S V F S £ S E E 

^aagcccctaaaa aactggagtacgacagtggtagcctgaagatggaacctgggacttctaagtggcggagggagcggcc^ 

TT7CGGGGATTTTTTGACC TCATGCTG TCACCATCGGAC TTC TACC TTGGACCCTGAAGATTCACCGCCTCCCTCGCCGGAC TCTCGACACTACTAAGTa 



-insert pLM1 



: <4PKKLEY0SG 



-ORF pLM1 



L * " EPGTSKVRRERPESCDO 



CCAAGGGTGGAG AACTGAAAAAGCCCAfCAGCCTCGGCC ACCC TGGTTCCC TGAAGAAG3GC AAGACCCC ACC TG TGGCTGTAAC TTCCCCCATC AC TCA 

ggt tccc acctcttgactttttcgggtagtcggacccggtgggaccaagggacttcttcccgttc tggggtggacaccgacattgaagggggtagtgagt 



-insert pLM1 
CACAGCCCAGAGTGC CCTCAAAGTCGCAGGCAAACCrGAGGGCAAAGCTACAGACAAGGGTAAGCTTGCAGTG^ArACr GGG-T.-C^CGCTc-T.--- 
GrGTCGGGTCTCA^GAGTTTCAGCGTCCGTTTGGACTCCCGTTTCGATGTCTGTTCCCArTCGAACGlcACTTCTTA^ACCCGAGGTrGCGAG.LG^ 2 ° 



-ORF pLM1 



*VKNTGLOR 



r A Q S A L K V A G K P £ G K A T D IC G K L 

:TGATGCTGGTCGGGACCGCCTGAGTGATGCTA AGAAGCCCCCCrCGGGCATTGCTCGCCCCTCCACTTCGGGATC DTTCGGrTAr £.,\n.$r,arrTrr T , 
AGACTACGACCAGCCCTGGCGGACTCACTACGATT CTTCGGGGGGAGCCCGTAACGAGCGGGGAGGTG AAGCCCTAGGAAGCCGATGTTCTTCGC-AGGA 



1 'Ov'. 



^OA GR QRLSDAKKPPSGlAQfiST SGSFGYKK P P 

CTGCCACAGGCACAGCCACrGTCATGCAAACTGG TG G TTCAGCCACTCTCAGCAAGATCCAGAAGTCCTCAGGCATCC CTGTCAAGCCAr.TAAATr..-r. f v ; 
GACGGTGTCCGTGTCGGTGACAGTACGTTTGACCACCAAGTCGGTGAGAGTCGTTC TAGGTCTTCAGGAGTCCGTAGGGACAGTTCGGTCATTTACCCGC *"* 



PATGTATVHQTG GSATLSKIQKSSG 1P VKPVNGR 

CAAGACTAGCTTA G ATGTTTCCAACAGCGCAGAGC CAGGATTCCTGGCTCCTGGAGCCCGT T CTAACArcCAGTACC GCAGCCTG C CCC,,rrA, r -^-: 
GTTCrGArCGAATCTACAAAGGTTGrCGCGTCTCGGTCCrAAGGACCGAGGACCTCGGGCAAGATTGTA^GTCATGGCGTCGGACGGGGCCGGTc^ 



-ORF pLM1 



^TSLD V SNSAEPG FLAPGARSNIQYRSLPRP^f 

TCAAGrTCTATGAGCGrGACCGGCGGGCGGGGTGGA CCTCGCCCTGTGAGCAGCAGCATTGACCCCAGTCrccr CAGCACCAAGCAGG^GGrrTT,-,- 
AQTTCAAGATAC TCGCACTGGCCGCCCGCCCCACC TGGAGCGGGACAC ^CGrCGTCGTAACTGGGGTCAGAGGAGTCGTGGTTCGTCCC "CCGoAA TGcjj 




CTTCCAGACTGAAGGAGCCTACCAAGGTAGCCAG TGGGCGGACCACTCCAGCCCCTGTCAArCAGACAGArCG GGAAAAGGAGAAGGCCAAAr.rr„,,;- 
p^G^^TGACTTCCTCGGATGGTTCCArCGGTCAC CCGCCTGGTGAGGTCGGGGACAGWAGTCTGTCTAGCCCTTTTCCTCTTCCGGTrTC^uTrCCC 

-insert pLM1 



ACC AAGC TGGCA 



AGTGGCCTTGGACTCAGACAACArCTCCTTGAAGAGTATTGGCTCCCCAGAGAGTACTCCCAAGAACCAAGCAAGCCACCCC ACAGCC 

tcaccggaacctgagtctgttgtagagg aacttctcataaccgaggggtctctcatgagggttcttggtIcgttcggtggggtgtcgg t^ gacc g 

-insert pLM1 




-ORF pLM1 



V A L . D 5 D 1 5 I * S * G S P E S T P K N 0 A S H P T 



A T K L A 



gagctgccaccaacccctctcagggccacagcgaagagctt 



tgtcaaaccaccctcactagccaatcttgacaaggtcaactcca 



AC AG TCTGGATC7AC 



ctcgacggtggttggggagagtcccggtgtcgcttctcgaaacagtttggtgggagtgatcggttagaactgttccagttgag^ttg tcagacc TAGat:; 

-insert pLM1 



-ORF pLM1 



£ L P P T P L R A T A K SFVKPPSL A N L D K V N 



S N S L 0 L 



CATCATCCAGTGATACCACCCATGCTTCAA AGGTCCCAGATCTGCATGCTACAAGCTCAGCATC tgggggccctc tcccttcctgc ttcacccccagtcc 
gtagtaggtcactatggtgggtacgaagtttccagggtctagacgtacgatgttcgagtcgtagacccccgggagagggaagga^ 32 * : 



psssotthask 



-insert pLM1 
-ORF pLM1 



vpolhatssa 



GGPLPSCFTPSP 



GGCACCCATCCTCAATATTAACTCAGCCAGC TTCTCCCAGGGCCTGGAGCTAATGAGTGGTTTCAGTGTGCCAAAAGAGACCCGC A TGTACCCCAAAC tc 
CCGTGGG ^ A GGAGTTA TAATTGAGTCGGTCGAAGAGGGTCCCGGACCTCGATTACTCACCAAAGTCACACGGTTTTCTCTGGG CG TACATGflftKTTTr;^.-: ^ 




A P { 1 N t N 5 A S F S 0 G L E L M S G F S 



VPKETRMYPKL 



TCAGGCC 



TGC AC AGG AGCA TGG AG TC CC TC C AG A TGCC A ATG4GCCTCCCC AG TGCCTTCCCCAGC AG T AC TCCCGTCCCC AC CCrArrTr;rTrrrrr t.-: 



AGTCCGGACGTGTCCTCGTACCTCAGGGAC^ TCTACGGTTACrCGGAGGGGrCACGGAAGGGGTC GrCATGAGGGCAGGlcTc^GT^: 

-insert pLM1 



CGAGGGGGAC 



-ORF pLM1 



I G L H R S " E S L QMPMSLPSAFPS 



S TPVP TPPAPP 



CTGCrcCCACAGAAGAAGAGACGGAAGAGC TG AC T 



GACGAGGGTGTCTTC T TC TC TGC C T TC TCG AC TG 



TGGAGTGGAAGCCCC AGAGCTGGGC AACTGGACAGTAA TCAGCGGGATCGGA 



AC AC TCTTCCCAA 



AACCTCACCTTCGGGGTCTCGACCCGTTGACCTGTCATTAGTCGCCCTAGCCTTGTGAGAAGGGTT 35CC 




^ Ap TEE£TEELT 



-ORF pLM1 
V SGSPRAGQLD 



5N0RQRNrLP» 
GAAAGGGC TCAGGTACCAGCTTC AGTCCCAGGAGGAGACCAAGGAGAGGCGACATTCCCATACCATTGG TGGGCTGCC TGAA fCCGAT GACCAG TC AGAc 
CTTTCCCGAGTCCATGGTCGAAGTCAGGGTCCTCCTC TGGTTCCTCTCCGCTCTAAGGGTATGGTAACCACCCGACGGAC TTAGGCTAC TGGTCAGTC T." * " 




-ORF pLM1 



K G L R Y Q L Q S Q E E T K £ R R H S H T 1 G G L P £ S 0 0 0 S E 



CTGCCTTCTCCCCCTGCACTTCCCATG 



TCTCTGAGTGCAAAGGGCCAACTTACCAACATAGTGAGTCCCACTGCGGCCACCACGCCAAG 



AATCACCCGCT 




-ORF pLM1 

j- P S P P A L P H S L S A K G Q L T N I V S P T A A T T P p I T R 



CCAACAGCATCCCCACCCACGAGGCGGCCTTCGAGCrGTACAGCGGCTCCCAAATGGGGAGCACCCTGTCCCTGGCCGAG 



GGTTG TCGTAGGGG 



TGGGTGCTCCGCCGGAAGCTCGACATGTCGCCGAGGGTTTACCCCTC6TGGGAC 



AGACCCA AG GG A A T G A T TC S 
AGGGACCGGCTC TC TGGGTTCCC TTACT AAG 




SNSIPTHEA 



A / E L Y S G S Q ^ G ,5 T L S L A £ R p K G M I R 

GTCAGGATCC 

" * . " ' AGAGGATG 



TTCCGAGACCCCACGGACGATGTTCACGGCTCAGTGCTGTCCCTGGCCTCCAGTGCCTCCTCCACCTACTCCTCAGCTGAGG 



cagtcctaggaaggctctggggtgcctg ctacaagtgccgagtcacgacagggaccggaggtcacggaggaggtggatgaggagtcg"^ tc c r AC 

-insert pLM1 



SG3FR0PT0DV 



-ORF pLM1 



H G S V L S L A SSASSTYSSAEERM 



caatctgagcaaakcggaagcttcgtaggga actggaatcatcccaggaaaaagtggccaccttgacgtctcagctttctg^ 

G TTAGAC TCGTTTACGCCTTCGAAGC ATCCCTTGACCTTAGTAG GG TCCT TTllC &C C r*c.Tar, & ac rr.r lr r/-.- TTTT'TTT iff ' 




° S E Q [ R K L R R E L E S S Q E K V A T L T 



S0L5A-NANL 



CTGCTrTTGAGCAGAGCCTGGTGAATATGACATCCCGCCTGCGACACC TGGCAGAGACGGCCGAGG AGAAGGACAC TG 



TGAGCTGCTGGATTTGCG AG AAAC 




-ORF pLM1 

M , TSR IRHL A£ TA ^EK0TELL0L 



E T 



WO 98/24810 71/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:56 
fig 53 pLM6 (1 > 4947) Site and Sequence 



AGAAGGAGGTATCGGAGCTGCGCTC TGAGC 



TATGGGAGAAGGAAATGAAGCrTACAGACATCCGCTTGGAGGCCCTCAACTCTG CCCACCAACTGGArCA 
TCTTCCTCCATAGCCTCGACGCGAGACTCGATACCCTCTTCC TTTAC TTCGAATGTCTGrAGGCGAACC TCCGGGAGTTGAGACGGGTGGTTGACC TAG^ 



-U3 stuk 



* K E V f I I J] S £ L VEKEMKLTDIRUE 



-ORF 



A L N S A H Q l 0 0 



GC TTCGGGAGACCATGCACAACATGCAG TTGGAGGTGGACCTGC TGAAAGC AGAGAATGACCGACTGAAGG TAGCCCCAGGCCCCT CATCAGGCTCCA''T 

CGAAGCCCTC tggtacgtgttgtacgtcaacctccacc tggacgac tttcgtctc ttactggctgacttcc atcggggtccggggag tagtccgaggtg! 




LRETMHNMQLEVD 



ccagggcaggtccctggatcat 



LLK A £NDRLK VAPGPSSGST 



vtctgcattatcttccccacgccgctccctaggcctggcactcacccattccttcggccccagtctt gcagacacagacc: 
ggtcccgtccagggacctagtagacgtaatagaaggggtgcggcgaggga ttcggaccgtgagtgcgtaaggaagccggggtcagaacg tctgtgtc tgg 




PGQVPGSSALSSP 



R R S L 6 L A L T H S F G P S L A D T 0 



TGTCACCCATGGATGGCATCAGTACTTGTGGT CCAAAGGAGGAAGTGACCCTCCGGGTGG TGGTGAGGATGCCCCCGCAGCACAT CATCAAAGGGGACT7 
ACAGTGGGTACCTACCGTAGTCArGAACACCAGGTrTCCrCCTTCACrGGGAGGCCCACCACCACTCCTACGGGGGCGTCGTGTAGTA ' 



TAGTTTCCCCTGAA 




LSPWQG i srCGPKEEV TLRV VVRHPPQh I IKGDL 

GAAOCAGCAGGAArTCrTCCTGGGCTGTAGCA AGGTCAGTGGAAAAGrTGACTGGAACArGCrGGATGAAGCTGT TTTCCAAGTGTTCAA^riMrT^T^Tr 
CrrcGrCGTCC-r AAGAAGGACCCGACATCGTTCCAGTCACC TTTT CAACrGA CCTTCTACGACCTACTlcGACA, 

-U3 stuk 



AAAGGTTCACAAGTTCC7GATA7AA 



-ORF 



Eavfqvf:-:oy I 



* 0 0 E r F L G C S K V S G K V Q y K m ,. p 

rcrAAAArGGACCCAGCCrCTACCCTGGGACTAAG CACTGAGTCCATCCATGGCTACAGCATCAGCCACGrGAAACGAGTGrT GGATGCAGAGCCCCrr, 
AGATf rTACCTGGGTCGGAGATGGGACCCTGATTC GTGACTCAGGTAGGTACCGATGTCGTAGTCGGTG CACTTTGCTCACAACCTACfiTrTrpr.f.cr.t"^ 



-U3 stuk 



-ORF 

M 0 P A S j ^ GL S T £ S \ HQ Y 5 I S H V K R V L 0 A E P p 



WO 98/24810 72/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 53 pLM6 (1 > 4947) Site and Sequence 



AGATGCCTCCTTGCCGTCGAGGTGTCAATAACATATCAGTCTCCCTCAAAGGTCTGAAGGAGA^Tfirr. 



TCGACACCCTGGrGTTCGAGACGCTGATCC:- 




CAAGCCGATGATGCAGCAC 



GTTCGGC TAC TACG TCGTGATGTATTCGGAGGACG 



TACATAAGCCTCCTGCTGAAGCACCGGCGCCTCGTCCTCTCGGGCCCCAGCGGCACGGGCAAGACC TArr Tfi&rr a * tv <"^• 




:H;>: 



T^^^ A ^ A ^^^^GAGCGCTCTGGCCGTGAGG TCACAGAGGGCAT^ GCAACTf"** 

-U3 stuk 



-ORF 



L A E * L V E * 5 ° » E V T £ G I V S T F N M H 
ATCTTrCCAACCTAGCCAACCAGATAGACCGGGAAACAGGAArTGGGGATGTGCrrrT^ 



QQSCtCOLQ 



TGATTC TATTGGA TGACC TGAG TGA ACCAGGC TCCA TC AG 




E L V N G A 



<■ T C K Y H K C P 



* I I G T T N Q 



PVKMTPNHG 



CACTTGAGCTTCAGGArGTTGACCrTCrCCAACAACGTGGAG CC 
GTGAACTCGAAGrcCTACAACTGGAAGAGGTTGTTGCArrrr^ 



AGCCAATGGCTTCCTGGTTCGTTACCrGAGGAGGAAGCTGGTAn^rr.r: 



ACAGCG 




M <-SFR MUTF 



S N N V E p 



ANGFLVRy 



WO 98/24810 73/270 PCT/EP97/06956 

Tuesday, 18 November 1997 13:57 
fig 53 pLM6 (1 > 4947) Site and Sequence 



AC ATCAArGCCAACAAGGAAGAGCTGCTTCGGGTGCTCGAC ^GGG^^CCCAAGCTGTGGTATCATCTCCACACCTTCCTTGAGAAG CACAGCACCTCAG'i 
TGTAGTTACGGTTGTTCCTT"CTCGACGAAGCCCACGAGCTGACCCATGGGrTCGACACCATAGTAGAGGTG TCGAAGGAACTCTTCGTCTCGTuGAGTC*" 




° 1 N A N K E . £ L L .« v t 0 V V P K L V Y H L H T F L £ K H S T 3 D 
CTTCCTC ATCGGC CC TTGCTTCTTrCTGTCGTGTCCCATTGGCATTGAGGACTTCCGGACCTGGTTCATTGACCTGTGGA^ TC TATCA TTCCCTAT 

gaaggagtagccgggaacgaagaaagacagcacagggtaaccgtaactcctgaaggcctggaccaagtaactggacaccItgttg^ 



-U3 stuk 



-ORF 



F > l - G P C / F L S C P ! . S [ £ .0 F R T V F I Q L y N N s t , p y 

ctacaggaaggagccaaggatgggataaaggtccatggacagaaagctgcttgggaggacccagtggaatgg^ 



GATGTCCTTCCTCGGTTCCTACCCTATTTCCAGGTACCTGTCTTTCGACGAACCCTCCTGGGTCACCTTACCCAGGCCCTGTGTGAAGGGACCGGTAGT 




t 0 £ G A K 0 G I K V H G 0 K A A V £ 0 p y E V V R p T L P V P 3 

CCCAACAAGACCAATCAAAGCTGTACCACC TGCCCCCACCCACCGTGGGCCCTCACAGCATTGCCTCACCTCCCGAGGATAGGACAGTCA AAGACAGCAC 
GGGTTGTTCTGGTTAGrTTCGACArGGTGGACGGGGGTGGGTGGCACCCGGGAGTGTCGrAACGGAGTGGAGGGCTCCTATCCTGTCA G lT7CTGTCGT- ^ 




AQQQQS K i V H L P PPTVGPHS I ASPPEDR X V K 0 S T 

CCCAAGTTCTCTGGACTCAGATCCTCTGA TGGCCATGCTGCTGAAACTTCAAGAAGCTGCCAACTACATTGAGTCTCCAGATCGAGAAAC CATCCTGG^ 
GGGTTCA,GAGACCTGAGTCTAGGAGACTACCGGTACGACGACTTTGAAGTTCTTCGACGGrTGATGTAACTCAGAGGT^AGCTCTTTar.T,,,,.--,-T' ^ 




P S 3 L 0 S D P L M A M L L K L Q E A A N r , E S p Q B E r , L 0 
CCC AACC T fCAGGCAACAC rTTAAGGGTTCGGCAATCAC TGTCACCCCCGGACAGCAG AACGC TGGC ATCAGC TATCTTAGC rCCTCCTCTCCCCTCTCC 




•IK 



-ORF 

3 M L 0 A T L 



G F s nhchprtaervhols 



- L L 3 P L 



WO 98/24810 74/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 53 pLM6 (1 > 4947) Site and Sequence 

TCTTTCAGAGCACTGGCTC |CCAGCCCCAGGAGGAGAACAGGAGGGAGGAGGAGATGAAAGAGGAGGGACAGGTTCTTGGTGC TG TACC T T TGAGAAC T ' 
AGAAAGTCTCGTGACCGAGAGGTCGGGGTCCTCCTCTTG TCC TCCC TCCTCCTCTACTTTCTCC TCCCTGTCCAAGAACCACGACATGGAAACTrTTfiAl ^ 



-U3 stuk 



LFQSTGSPAPGGEQEGGGOER 



6GTGSWCC TFENF 



CCTAGGAAGGAATGGTGGGGTGGCGTTTGGGAACTTGTGCCCCCTAAACACATTTAC TGGCC TCCTCTAGAGCGGCCGCCACCGCGGTGGAGC TCCAATT 

ggatccttccttacc accccaccgcaaacccttgaacac gggggatttgtgtaaatgaccggaggagatctcgccggcggtggcgccacctcgaggttaa £3K 

-U3 stuk 

I G ft N G G V A F G N L C P L N T F T G L L . S G R H R G G A P | 

CGCCCTATAGTGAGTCGTATTACGCGCGCTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAAC TTAATCGCCTTGCAGC 

GCGGGATATCACTCAGCATAATGCGCGCGAGTGACCGGCAGCAAAATGTTGCAGCACTGACCCTTTTGGGACCGCAATGGGTTGAATTAGCGGAACGTCG 
R P ' V S R f T R * . H V P S F Y N V V T GKTLALPNL IALQ 



ACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGGACGCGCCCTGT 

TGTAGGGGGAAAGCGGTCGACCGCATTATCGCTTC tccgggcgtggctag cgggaagggttgtcaacgcgtcggacttaccgcttaccctgcgcgggaca 

H ' P <- S P A G V I AKRPAP IALP NSCAA . M AN G T R P V 

agcggcgcattaagcgcggcgggtgtggtggttacgcgcagcgtgacccctacacttgccagcgccctagcgcccgctcctttcgctttcttcccttc c- 
tcgccgcgtaattcgcgccgcccacaccaccaatgcgcgIcgcactggcgatgtgaacggtcgcgggatcgcgggcgaggaaagcgaaagaaggcaagga " 6i>: 

AAH . • AR BVVVLRAA . PLHLPAP . RPLLSLS S l_ P 
TTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACr 

aagagcggtgcaagcggccgaaaggggcagttcgagattIagcccccgagggaaatcccaaggctaaatcacgaaatgccgtggagctggggttttttga " 7; '' 
FSP RSPAFPVKL . iggsl ■ g s o l v l y g t s t p k h 

tgattagggtgat ggttcacgtagtgggccatcgccctgatagacggtttttcgccctttgacgttggagtccacgttctttaatagtggactcttgttc 

ACTAArccCACTACCAAGTGCATCACCCGGTAGCGGGACTATCTGCCAAAAAGCGGGAAACIGCAACCTCAGGTGCAAGAAArTATCACCTGAGAACAAU 
L ' * V " V H - V V G H » P 0 » " F. F A L . R V S P R S L I V D S C 3 

CAAACfGGAACAAC ACrCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATrTCGGCCTATTGGTTAAAAAATGAGCrGATTTAA: 
GT"rTGACCTTGTTGTGAGTTGGGArAGAGCCAGATAAGAAAACTAAATATTCCCTAAAACGGCTAAAGCCGGATAACCAATrTTTTACTCGACrAAATT3 ^ 
K L E 0 H S T L S R S I L L tYKGFCRFRp I G . K M S . F U 



AAAAATTTAACGCG AATTTTAAC AAAA TA TTAACGCTTAC AATTTAG 
TTTTTAAATTGCGCTTAAAATTGTTTTATAATTGCGAATGTTAAATC 
* N L T RILTKY.RUQFR 



WO 98/24810 75/270 PCT/EP97/06956 

Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 

Enzymes : 72 of 148 enzymes (Filtered) 

Settings : Circular, Certain Sites Only, Standard Genetic Code 

GTGGCAC T T rTCGGGGAAATGTGCGCGGAACCCC TATTTGT ITATTTTTC TAAAT ACATTCAA ATATGT ATCCGC fCATGAGAC AA TAACCC TGATAAAT 

CiCCGTGAAAAGCCCCTTTACACGCGCCTTGGGGATAAACAAATAAAAAGAnTATGTAAGTTTATACATAGGCGAGTACTCTGTTArTGGGACTArTTA ,0 ° 
G G T F R G N V ft G T P t C L F F . 1 H 5 N H Y P L M R Q . p . M 

GCTTCAATAATA rTGAAAAAGGAAGAGTATGAG ^ATTCAACATTTCCGTGrCGCCCTrATTCCCTTTTTTGCGGCATT TTGCCTTCCrGTTrTTGCTCAC 

CGAAGTTATTA7 AACTrTTTCCTTCTCATACTCATAAGTTGrAAAGGCACAGCGGGAATAAGGGAAAAAACGCCGTAAAACGGAAGGACAAAAACGAGTG ^ 
L Q •. Y : K ft K S li S | Q H F R V A L I P F F A A F C L P V F A H 

CCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACrGGATCTCAACAGCGGTAAGATC CTTGAGAGTT 
GGTCTTTGCGACCACTTTCATTTTCTACGACTTCTAGTCAACCCACGTGCrCACCCAArGTAjCTTGACC TAGAGTTGTC6CCATTCTAGGAACTCTCAA ^ 
PET LV KVK0A EDQ LGARVGY1EL DL N S G K I t E S 

TTCGCCCCGAAGAACGTTTTCCAATGATGAGCACrTTTAAAGTTC TGCTATGTGGCCCGGTATTATCCCCrATTGACGCCGGCCAAGAGCAACT CGGTCG 

AAGCGGGGCTTCTTGCAAAAGGTTACTACrCGTGAAAATTTCAAGACGArACACCGCGCCATAATAGGGCArAACTGCGGCCCGTTCTCGTTGAGCCAGC ^ 
FftPEE ftF Prttl STFKVLLCGA VLSRI DA GQEQL G R 

CCGCAfACACTATTCTCAGAATGACTTGGTTGAGrAC TCACCAGTCACAGAAAAGCATCTTACGGA TGGCATGACAGTAAGAGAATTATGCAGT GCTGCC 
GGCGTATGfGATAAGAGTCT TACTGAACCAACTCATGAGTGGTCAGTGTCTTTTCGTAGAATGCCTACCG TACTGTCATTCTCTTAATACGTCACGACGG 500 
* 1 H V S Q P L V g Y S P V T E K H L T 0 G M T V R E L C S A A 

ATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTT^ 

rATTGGTACrCACTATrGTGA^GCCGGTTGAATGAAGACTGTTGCTAGCCTCCTG6CTTCCTCGATTGGCGAAAAAACGTGTTGTACCCCCTAGTAC ^ 
I T H S 0 N T A A N I t L T T I G G P * £ L T A F t H M rl G 0 H V 

CrCGCCTTGAT CGrrGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCG TGAC ACC ACGATGCC TG rAGCAATGGCAACAACGTTGCGCAAAC T 
GAGCGGAACTAGCAACCCTTGGCCrCGACrTACTTCGGTATGGTTTGCTGCTCGCACTG TGGTGCTACGGACATCGTTACCGTTGTTGCAACGCGTTTGA 
F R U 0 R V E P £ L N E A I P N Q £ P Q T T M P y A M A T T L R K L 

ArTAACTGGCG AACTACTTACrCTAGCTTCCCGGCAACAATrAATAGACTGGATGGAGGCGCArAAAGTTGCAGGACCACTTCTGCGCTCGGCC CTTCCG 
TAATTGACCGCTTGATGAATGAGATCGAAGGGCCGTTGTTAATTArCTGACCTACCTCCGCCrArTTCAACGTCCTGGTGAAGACGCGAGCCGGGAAGGC ^ 
L f G E L C T L A S R Q Q L 1 D VME APKV AGP LLR S A L P 

GC TGGCTGGTTT ATTGCTCATAAATCTCGA6CCGGTGAGCCTGGG T'CTCGCGGTATCATTGC^SCACTGGGGCCAGATGGTAAGCCCTCCCGrATCGTAG 
C ; iACCGACCAAA TAACGACTaTTTA^ACC ^CGGCCACrCGC ACCC AGAGCGCCATAGTAACCrCGTGACCCCGGTCTACCATTCGGGAGGGCATAGCATC 9 °° 
AGV FjAOKSG AGE RGSRGll AAL G P 0 G K P S ft I V 

' *** ^ A ^^k* ^^*^ A ^|CAGGCAAC TATGGATGAACGAAATAGACAGArCGC TGAGATA jGTGCC TCACTGATTAAGCATTGG TAAC TG TCAGACCA 
AATAQATGruCTGCCCCrCAGTCCGTTGATACC rACTrGCTTrATCTGTCrAGCGACrCTAT^CACGGAGTGACTAATTCGTAACCATTGACAGTCTGGT ^ 

7 1 t gsoa r h o e r nr Qi a s t gasl ikhv . l s d q 

AGTT rACrCATAT ArACTTTAGATTGATrrAAAACTTCATTTTTAATTTAAAAGGATCrAGGrGAAGArCCTTTTTGATAATCrCATGACCAAA ATCCCT 

icaaa tgagtat atatgaaaictaac iaaattt tgaagtaaaaat taaat tttcc fagatccacttctaggaaaaactattagagtac tggt tttaggga U0 ° 

7 Y * ' 1 L • 1 0 I « 1 " r • P « A 1 . V K J L F Q N L M T K t P 

fAACOTGAGrTT TCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGArCTTC TTGASATCCTTTTTTTCTGCGCGTAATCTGCTQCTTGCAA A 
ArTGCACTCAAAAGCAAGGTGACTCGCAGTCTGGGGCArCTTTTCTAGTTTCCTAGAAGAACTCTAGGAAAAAAAGACGCGCArTAGACGACGAACGTT! ,200 
<REF $ FM - A S PPV,EKIKGS S OP 7 FLRV1C C L 0 

CAAAAAAACCACC GCTACCAGCGG7GGTTTGTTTGCCGGATCAAGAGCTACCAAC TCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATaCCAAA 
GrTTrTTTGGTCGCGArGGTCGCCACCAAACAAACGGCciAGrTCTCGAlGGTTGAGAAAAAaSCTrCCATTGACCGAAGTCGTCTCGCGTCTArGGTTi l3 °° 
^K»CPP LPAVV CLPOQ£LPTL FP KVrCFS R A Q I p N 

MCrGfCCTTCTA GTG TAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGT.13CACCGCCTAJATACC TCGCTCTGCTAATCCTGTTACCAG fGGCTGC T 
A '^ A< " A ^**^ A ^ A ^ A ^^^^^ A ^C**^CCGCT5GTGAAGTTC TTGAGACATCGrGGCGGAT j TATCGAGCGAGACGAr TAGGACAATGGTC ACCGACGA 
T ! L U V • * ; I G " H F K N SVAPPTYLALtlLLPVAA 



WO 98/24810 76/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 

CCCAGrGGCCATAAOTCGTGrCTUCCC^ 

CGGTCACCGCTATTCAGCACACAATGGCCCAACCTCAGrTCrGCTATCAATGGCCTATTCCGCGTCGCCAGCCCGACTTGCCCCCCAAGCACGTGrGTCG ,S °° 
a S G 0 K S C L T G L 0 S ft R . I P 0 K A 0 R S G . T G G S C T Q 

CCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGrGAGCTATGAGAAAGCGCCACGCTrCCC 

GGTCGAACCTCGCrTGCTGGArGTGGCTTGACTCTATGGATCTCGCACTCGATACTCTTTCGCGGTGCGAACCGCTrCCCTCTTTCCGCCTGrCCATAGG ,6 °° 
P 3 I £ R T T Y T £ L R Y L OREL . E J A T L P E G R K A p R y p 

GGTAAGCGGCAGGGTCCGAACAGGAGAGCGCACGACGGAGCrTCCAGGGGGAAACGCCrGGTATCTTTATAGTCCTGTCG GGTTTCGCCACCTCTGACTT 
CCATrCCCCGTCCCAGCCTTGTCCTC TCGCGTGCTCCCTCGAAGGrCCCCCTTTGCGGACCATAGAAATATCAGGACAGCCCAAAGCGGTGGAGACTGAA ,7 °° 
V SG RVG TGeRTR£ UPG G NAVYLYS P V G F R M L . L 

GAGCGTCGATTTTTGTGATGCrCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGG CCTrTTG 
C TCGCAGC TAAAAACACTACGAGC AGTCCCCCCGCCTCGGA TACC TT TTTGCGG TCCTTGCGCCGGAAAAArcCCAAGGACCGGAAAACGACCGGAAAAC 
E R R / L , ■ C S S C G R S L V K M A S N A A F L R F L A F C V P F 

CfCACArGTrCTTTCCTGCGTTATCCCCTGATTCrGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGA TACCGCTCCCCGCAGCCGAACGACCGAGCQ 

gagtgtacaagaaaggacgcaataggggac taagacacctattggcataatggcggaaactcactcgactatggcgagcggcgtcggc ttgctggctcgc 1900 

t a f e , a q tarrsrtter 



AHM FFPAL3P0SV0NR 



CAGCGAGTCAGTG AGCGAGGAAGCGGAAGAGCGCCCAArACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAArGCAGCTGGCACG ACAGGTTT 

GrcGCTCAGTCACTcccrcc ttcgccttctxgcgggttatgcgtttggccgagaggggcgcgcaaccggc taagtaattacgtcgaccgtgc tgtccaaa 2000 

5ESVSE EAEERP IRKPPL PARVP IH . CSVHORF 



CCCGAC TGG AAAGC gggcag 



rGAGCGCAACGCAATTAArGTGAGTTAGCTCACTCATTAGGCACCCCAGGCrTTACACTTTATGCTTCCGGC 



TCGTATGT 



GGGC ^GACCTTTCGCCCGTCACrCGCGTTGCGTTAATTACACrCAArCGAGTGAG TAATCCGTGGGGrCCGAAATGTGAAATACGAAGGCCGAGCATACA 2 '°° 
» 0 V K A G 5 E R w A | H V S . L T H . A P Q A L H F H L P A R M 

TGTGTGCAATTGTG ACCCGATAACAATTTCACA^ 

ACACACCTTAACACTCGCCTArTGrTAAAGTGTGTCCTTTGTCGArACTGGTACTAATGCGGTTCGCGCGTTAATTGGGAGTGATTTCCCTTGTTTTCGA 22 °° 
L C ° ' V S G * Q F , H T Q " S 7 P H D Y A K R A I N P H . R £ 0 K I 



GGGT ACCGGGCCCC CCCTCGAGGTCGACGGTATCGATAAGC ttca ^ATCGAATTCCTGCAGCCCCTGCTCTTCAGCCAGATGCTGGACCCAGAGTCCCAG 
CCCArGGCCCGGGGGGGAGCTCCAGCTGCCATAGCTATTCGAACT ATAGCrTAAGGACGTCGGGGACGAGAAGTCGGTCrACGACCTGGGTC TCACGGTC 



2300 



-insert pLM1 



G TGPPLEVOG 10 



KLO ICFLQPLLFSQ 



-ORF pLM1 

M L D P E S 0 



AGAA AGAGGACAGTGCAGAA TQ 



TCCTGGATCTCCGGCAGAACCTGGAAGAGACCATGTCCAGCCTGCGAGGGTCCC 




* K * T V ° N V L 0 R 0 M L E E T M S S L R G 5 Q V T H S S L E 

TCACCTGC ^ A ^G^ A ^^G A ^GATGCCAACCCACGCAGCGTGrcCAGCCTCTC CAACCGCTCGrCCCCrCTGTCA TGGCGC TATCGCC ACTCCAGTCCGCG 

CAGGTCAGGCGC 



ACTGGACGATGCTGTCGCTACTACGGTTGGGTCCGTCGCACAGGTCGGAGAGGTTGGCGAGCAGGGGAGACAGTACCGCGATACCGG7 



2500 



-insert pLM1 



-ORF pLM1 

" T C Y Q S Q Q A N p R s y 5 ^ y g 2 S S P R 



WO 98/24810 77/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 



-insert pLM1 



-ORF pLM1 



L Q ft G 0 A P S V G G S C R S E G T P A V V M H G E R A H Y S H T 

ArGCCCATGCGCAGCCCCAGCAAGCrCAGCCATATCTCCCGCCTGGAGCTGGTCGAATCCCTGGACTC GGArGAGGTGGACCTCAAGrcCGGCTACATGA 
TACGGGTACGCGTCGGGGTCGTTCGAGTCGGTATAGAGGGCGGACCrCGACCAGCTTAGGGACCTCAGCCTACTCCACCTGGAGTTCAGGCCGATGTACT 



2700 



-insert pLM1 



-ORF pLM1 



* P M R S P S K L 3 H 1 S R L £ L V E $ LOSOEVDIKSG YM 

GCGACAGTGACCrCATGGGCAAGACCATGACGGAGGATGATGACATCACrACCGGCTGGGATGAAAGCAGCTCCArCAGTAG TGGAC TCAGCGATGCCTC 
CGCTGTCACTGGAGTACCCGTTCTGGTACTGCCTCCTACTACTGTAGTGArGGCCGACCCTAC TTTCOTC 6AGGTAGTCATCACCTGAGTCGCTACGGAG 



2800 



-insert pLM1 



S0S0LMGKTHTEO0O 



-ORF pLM1 



TTGVOESSS ISSGLSD 



A S 



AGACAATC ^^AGrrCAGAAGAATTCAATGCCAGCTCCrCACTCAACTCCCTCCCAAGTACTCCCACTGCTTCTCGCAGGAACTCA ACAATAGTGCTACGC 
TCTGTTAGAGTCAAGTCrTCTTAAGTTACGGTCGAGGAGTGAGTTGAGGGAGGGTTCArGAGGGTGACGAAGAGCGTCCTTGAG TTGTTATCACGATGCG 

-insert pLM1 



2900 






GC TGCAGGCTGG TGACGCGCCC TCTGTGGG TGGGAGC TGCCGC TCGGAGGGCACCCCCCCCTGGTACATCC ACCGCCAACCGGCCCaC TAC TCCCACACC 
CGACGTCCGACCACTGCGCGGGAGACACCCACCCTCGACGGCGAGCCTCCCCTGCGGGCGGACCATGTACGrGCCGCTTGCCCGGGTGATGAGGGrGTGG 28 °° 



DNL SSE EFNASSSLNS LPSTPTASRRNST IVLR 



acagactcagagaagcgctcactggcagaaagtggcc TGAGCTGGTTTAGTGAATCAGAGGAGAAAGCCCCTAAAAAACTGGAGTACGACAG tggtagcc 

TGTC TGAGTCTCTTCGCCAGTGACCGTC T"f TCACCCGAC TCGACCAAATCACTTAGTCTCCTCTTTCGGGGATTTTTTGACCTC ATGCTGTCACCATCCG *** 




TOSEK RSLAE SGL SVFS ESEEKAPKKLETOS 



rGAAGATGGA^CC TGGGACTTCrAAGTGCCGGACGGAGCGGCCTGAGAGCrGTGATGATTCATCCAAGGGrGGAGAACTGAAAAAGCCCATCAGCC FGGG 

AC* T TC TACC T TGGACCCTGAAGAT TCACCGCCTCCCTCGCCGGAC TCTCCACACTACTAAGTAGGTTCCC ACCTC TTCAC TTTTTCGG6TAC TCGGACCC 3, °° 



-insert pLM1 



-ORF pLM1 



t * M E P G T S K V R R £ R P £ $ C D PSSKGGELKKP I SLG 



CCACCCTGGTrcCCT 



CTGAAGAAGGGCAAGACCCCACCTGrGGCrGTAACTTCCCCCATCACTCACACAGCCCAGAGTGCCCTCAAAGTCGCAGGCA AACC T 
GGIGGGACCAAGGGACTTCTTCCCGTTCTGCGG rGGACACCGACArTGAAGGGGG TACTGACTGTGTCCGGTCTCaCGGGAGTTTCAGCGTCCCTTTGGA ^ 




GAGGGCAAAGCT ACAGACAAGGGTAAGCTrGCAGTGAAGAATACTGGGCTCCAACGCTCCTCCTCTGATGCTGGTCGGGACCGCCTGAGTGATGCTAA GA 
CTCCCGTT TCGATGTC TCTTCCCATTCGAACGTCAC T TC TT ATGACCCCAGCTTGCCAGGAGGAGAC TACGACC AGCCCTGGCGGAC TCAC T ACGAT TC T 33 °° 



-ORF pLM1 



G K A T 0 K G X L A V K H T GLORSSSOaGRORl 



SOAK 



WO 98/24810 78/270 PCT/EP97/06956 

Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 



AGCCCCCCTCGGGCATTGCTCGCCCCTCCACTTCGCGArCCrTCGGCTACAAGAAGCCrCCTCCTGCCACAGGCACAGCCAC 



rCGGGGGGAGCCCGTAACGAGCGGGGAGGTGAAGCCCTAGGAAGCCGATGTTCTTCGGAGGAGGACGGTGrCCGTGTC 



TGTCAfGCAAAC TGG TGG 
CG TGACAGTACGrTTGACCACC 



3400 



-insert pLM1 



-ORF pLM1 



K P P S G I A R P S T S G S f G Y K K P P P A T G T A T V M 0 T G G 

rrCAGCCACrCTCAGCAAGATCCAGAAGTCCTCAGGCATCCCTGTCAAGCCAGTAAATGGGCGCAAGACTAGCTTAGATGTTTCCAAC AGCGCAGAGCCA 
AAGTCGGTGAGAGTCGTTCTAGGTCrTCAGGAGfCCG TAGGGACAGTTCCGTCAT TTACCCGCGTTCTGATCGAAfCTACAAAGGTTG TCCCGTCTCGGT 



3500 



-insert pLM1 



-ORF pLM1 

S , A T . L S * 1 Q * S S G t P V K P V N G R KTSLOVSNSAEP 

G^ATrCCTGGCTCCTGGAGCCCGTTCTAACATCCAGTACCGCAGCCTGCCCCGGCCAGCCAAGTCAAGTTCTATGAGCGTGACCGGCGGGCGGGGTGG AC 
CCTAAGGACCGAGGACCTCGGGCAAGATTG TAGGTCA TGGCGrCGGACGGGGCCGGTCGGTTCAGTTCAAGATACTCGCACTGGCCGCCCGCCCCACCTG 



+ 3600 




-ORF pLM1 



G F L A P G A R S N | Q Y R S L P R P A K S S S M S V T G G R G G 

CrCGCCCTGrGAG CAGCAGCATTGACCCCAGTCTCCTCAGCACCAAGCAGGGAGGCCTTACGCCTTCCAGACrGAAGGAGCCT ACCAAGGrAGCCAGTGG 
CAGCGGCACACTCGTCGTCG tAACTGGGGrCAGAGGAGTCG TGGTTCGTCCCTCCGGAATGCCGAAGGTC TGACTTCCTCGG; 




3700 



P R P V S S S 1 D P S L L S T K Q c 6 L T p s R L £ v a a c 

gcggaccactccagcccctgtcaatcagacaca ^CGGGAAAAGGAGAAGGCCAAAGCCAAGGCAGTGGCCTTGGACTCAGACAACATC tccttcaagagt 




3800 



P T T ? A P V U 0 T 0 » £ < £ * A K A K A V A L 0 S D H 1 S L K S 
.7TGGCTCCCCAGA GAGTACrcCCAAGAACCAAGCAAGCCACCCCACAGCCACCAAGCTGGCAGAGCTGCCACC AACCCC TC TC AGCGCC AC AGCQA AG A 




3900 



G SP £ S TPKW QASHP T A TK L A E L P P T P L R A T AK 
G JTTrGrCAAACC ACCCrCACrAGCCAATCTTGACAAGGTCAACTCCAACAGTCrGGATCTACCATCATCCAGTG ATACCACCCATGCTTCAAAGGrCCC 

r aj>rif.TrT'.'fr/;riifT>-. .- _ _ . - . ' 1 ' 1 "" "" * I ■ ■ i i I i i t 




S I V < ? P S L A N L 0 K V N S H S L 0 L P S S S 0 T T H A S K V P 
^£JCC^C£TACA^ 

rcrAGACGrACGArGrrCGAGrCGTAGACCCCCGGGAGAGGGAAGGACGAAGTGGGGGrCAGGCCGrGGGTAGGAGTTArA^ 



4100 




I'lhatssasggpl pscf r p s p a p i ln i n s a s f s 



WO 98/24810 79/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 



C^GGGCCTGGACCTAATGAGfCCTTTCAGTCT GCCAAAAGAGACCCGCATCTACCCCAAAC TC TCAGGCCTGCACAGGAGCATGG AGTCCC TCCAGATGC 
GrCCCGGACCTCGATTACTCACCAAAGTCACA A GGTTTTCTCrGG6CGTACATCG6GTTTGAGAGTCCGGACGTGrCCTCGTACCTCAGGGA G GT C r A r^ * 2 °° 




Q S t E L M S G F S V P K £ TP 



M Y P * l S Gl. HRSMESLO 



CAATGAGCCTCCCCAGrGCCTTCCCCAGCAGTAC rCCCGTCCCCACCCCACCTGCTCCCCCTGCTGCTCCCACAGAAGAAGA GACGGAAGAGrrnarTT. 
GTTACTCGGAGGGGTCACGGAAGGCGTCGTCATGAGGGCAGGGGTGGGGTGGACGAGGGGGACGACGAGGGrGTCTTCTTCTC TGCCf TCTCGACTGAAC ^ 




p H SLPSAFPSST 



-ORF pLM1 

pyp tppappaapt e e E t e e l t v 



GAGTGGAAGCCCC AGAGC TG6GCAAC TGGAC AG 

CrCACCTTCGGGGTCTCGACCCGTTGACCTGTCATTAGTCGCCCTAGCCTTGTGAG, 



TAATCAGCGGGArCGGAACACTCTTCCCAAGAAAGGGCrCAGGTACCAGCrTCAGTCCCAGGAGG AG 
iAAGGGTTCTTrcCCGAGTCCATGGTCGAAGTCAGGGTCCTCCTC 




4400 



-ORF pLM1 

S G 3 P P A G 0 L 0 SNQRORMrLPK 



* G L R Y0L0S0EE 



ACCAAGGAGAGGCGACAT 



TGCTrCCTCrcCGCTCTAAGGGTATGGT 



TCCCATACCATTGGTGGGCrGCCTGAATCCGArGACCAGTCAGAGCTGCCTTCTCCCCCTGCAC 



AACCACCCG ACGG AC TT AGGC T AC TGG TC AG TC TCGACGGAAGAGGGGGACG TG AAGGG TAC 



TTCCCATGTCTCTGACTG 



AGAGACTCAC 



4500 




T <CRR HSHTI GGL PESDD 0 S E L 

CAAAGGGCCAACTTACCAACATAGTGAGTCCCACTGCGGCCACCACGCCAAGAATCACCCGCTCO 
GTTTCCCGGrTGAATGGTTCrATCACTCAGGGTGACGCCGGTGGTGCGGTTCTTAGTCGGCGAGGTTGl? 



p s PPALPMSls 



AACAGCATCCCCACCCACG AGGCG6CCTTCGAGCT 
GTAGGGGTCGGTGC TCCGCCGGAACCTTGA 



4600 




A < o o i r n i v s 

G f AC AGCGGC TCCCAAATGGGGAGCACCC 



-ORF pLM1 



■ P T A » T T P R ' ' R S H S 1 P T H £ a A F E L 

rGTCCCrGGCCGAG AGACCCAAGGGAATGATTCGGTCAGGATCCTTCCGAGACCCCACGGACGATGrrCAC 
AAGCCAGTCCTAGGAAGGCTCTGGGGTGCCTGC IACAAGTG 



CArGrCGCCGAGGGTTTACCCCTCGTGCGACAGGGACCGGC TC TC TGGG T TCCC T T AC T 




4700 



r S G S 0 M G S T 



LSLAEBP *CH|R S G SFROP TOOV H 
GGCTCAGTGCTGrcCCrGGCCTCCAGTGCCTCCrCCACCTACrCCTCAGCTGAGGAGAGGATGCAATCTn^r, 



CCGAGTCACGACAGGGACCGGAGGTCACGGAGGAGGTGGATGAGGAG TCGACTCCTCTCCTACGTTAGAC 



C AAA TCCGGAAGC T TCG T AGGGAACTGG 
rCGTTTAGCCCTTCGAAGCATCCCTTGACC 




A^ E E R M Q S EOIRKLRREL 







WO 98/24810 80/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 




-ORF pLM1 



CCAGTCATTCACGCAGCCCTTAATGCCTCAGAAACCACACCCAAACAACTTCGCATCAAGAGACAAAiirT ^ 



TCCTCAGATAGCATC TCAAGCC TCAACAGCA 




5100 



rCACrAGCCATTCCAGCATCGGCAGCAGCAAGGArGCTGATGCGAAAAAGAAGAAAAAAAAGAG TTGGGTC 
AGTGATCGGTAAGGTCGTAGCCGTCGrCGrTCCTACGACTACGCTTTTTCTTCTTTTTTTTCTC 



JATGAGCTTCGAAGTTCCTTCAACAAAGC 




5200 



J T 5 H S 3 I G S S 



-ORF pLM1 

kpadakkkkkk s v V V t L R s S f 



GTTCAGTATAAAAAAGGGGCCCAAGTCAGCTTCCT CA 
CAAGTCATATTTTTTCCCCGGGTTCAGTCGAAGGAGT 



rACTCGGATATAGAGGAGATTGCTACACCCGACTCTTCAGCCCCCTCATCCCCCAAACTACAG 




5300 



^TGGTTCCACAGAGACTGCTTCACCCTCCATCAAGTCCTCCACCrTGTCCTCCGTGGGCACTGATG 
GTACCAAGGrGTCTCTGACGAAGTGGGACGTAG rrCAGGAGGrGGAACAGGAGGCACCCGTGACTACAGTGGCTCrCRi 



TCACCGAGGGCCCTGCTCACCCAGCCCCCCACA 



GGACGAGTGGGTCGGGGGGTGT Sq °° 




M S S T E T A S P S 



-ORF pLM1 

lKSST t.SSVG TOVTEGPAHP 



CTAQGC ^^^^^^ A ^^^ A ^^QAGGAX»GAGGAGCCAG AGAAGAAGGAGGTArCGGAGCTGCGCrCTGA GCTArGGt;afiAArr.AAATr A^ffr- 
GATCCGACAAGGrACGrrTACTCCTCCTCCTCGGrC TCTTC TTCC TCCA rAGCC TCGACGCGAGAC TCGArACCCrCTTCCTTTACT 



rr AC AG AC AT 



TCGAA TGTCTGTA 



5500 




-ORF pLM1 



' B L f H A w E E E e y E " " ^ s t , r 5 E L y e K t „ K L r D , 

"T!r cCTCAACTcrGCCC ' cc " CTCgAT ^^^ 




5700 



a L K V A P 



CPSS GSfPSQvpGs S A l S S P P P s ,. 



CCTGCCACACCTGGCAGAGACCGCCGACCAOAAGG ACACTCAGCTGCrGGArTTGCGAGAAACCATACACrrrCT GAACAAAAflGAAC ,CTGA G GCCCAG 
GGACGCTGTGGACCGTCTCTGCCGGCTCCreTTC CTGTGACrCGACGACCrAAACGCTCnTGGTArc TGAAAGACTTcrTTTrCTIGAOACTCCGGCt 5000 



WO 98/24810 81/270 PCT/EP97/06956 

Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 



AGTGGCTAAG%«AAGCCGGGGTCAGAACGTCTGTGrCTGGAC AGTGGCTACCTACCGTASTC ArGAACACCAt'nTTTrrT > ™» 




LT H STGPSLAOT 



-ORF pLM1 
0 L 5 P H 0 G 



5TCGPKCEVT 



GGTGAGGATGCCCCCGCAGCACATCA 



L ft V V 



CCAC TCCT ACGGGGGCG fCG TG TAG TAG TT TCCCC 



TCAA AGGGG AC TTGA AGCAGC AGG AATTC TTCC TGGGC TG 



TGAACTTC6TCGTCCTTAAGAAGGACCCGACATCGTTCC 



TAGCAAGGTCAG TGGAAAAGTTGACTGGAAGATG 



AGTCACCTTrTCAACTGACCTTCTAC 



5900 




c. W r^cmTTK aww ,^ r>M , r B r ^, w ^ wrOTe| ^ itj||| f|fi|a[|w 

°'«™"c 6 ^ C6TTC « AAGITC ^^ 



6100 




WO 98/24810 82/270 PCT/EP97/06956 



Tuesday, 18 November 1997 13:57 
fig 54 pLM1 (1 > 8285) Site and Sequence 




[ N OP V K M T P N H G 



I H L S F R H L T F S N N 



VERA N G F L 



rrccrr,cc^ 6(iw ^ 6 cTacT A c A6 rc,c«c*=c6A CArcAAraccA,cA» 60aAOA6 cT0CTTCG 6 CTccTCG ACT G c G T A ccc^r, f . T<; . T . 

AAGCAATGCACTCCTCCTTCCACCArCTCAGTCrGTCCCT GTACTTACCGrTGTrcCTT CTCCACCAACCCCACCAGCTGACCCATGfinTTr/SArjyfA * ^00 



VBVLRRKLVES 



-ORF pLM1 



*>. S 0 ' " * " t E E L L h V L 0 W V P K L 



rCArCTCCACACCrTCCT,GAGAAGCACAGCACCTCAGACTT CCTCArCG C CCCTTGCTTCTrTCTGrCGTG T CCCATTGGCATTGAG G A C rrrrn. t r, 



AG T4 G AGG TG TC G A AG G A AC TC T T CC TG TC GTG GAG T C 



TGAAGGAGTAGCCGGGAACGAAQAAAGACAGCACAGGGTAACCGTAACTCCTGAAfifif r 



6800 




-ORF pLM1



- L - ' " L 1 ' M * r S 0 r t- ■ 0 P C F F C S C P | 6 , e Q , „ T 

rGGTrCATrGACCrCTGGAACAACTCTATCATTCCCTATCT A CACGAAGGAGCCAAGGATGGGAIAAAGGTCCATGGACAOAAAG CTGCTTGnGA^r 
AUAAGTAACTGGACA C CTTGT,GAGA,AGTAAGGGATAGAr 6 TCCTTCCTCGGTTCCrACC CTATTrCCAGGTACCTGTCTTTC C Ar r .A, f r, WT ^ ^ 




' ' ' - C V " S S ! ' > ' <■ » ' ' U ° « » < V H G 0 « A A V E 0 
C*GTGG,AtGGGtCCGGGACACACirCC^ 

°^CCTTACCCAGGCCC I G I? TGAAGGG^ ggl ,000 




7100 



ASP P E 0 R T 



-ORF pLM1



V " " $ T S S 5 «• » S 0 P U W A M L , t Q e 

AACT.CA: T ^ cr c I ccAGA r cOAGAAACCATCCrGGACCCCAACC I rCAGGCAACACrTTAAGGGrxCG G CAArCACrGrCACCCCCGGACAGCAGAAC 
^^jATGiA^vTCAGAGGrCTAGCTCrTTGGrAGGACCrGGGfi rTGGAAG ^CGGTTGTGAAAT TCCCAAGCCGlTAGTGACAGTGGGGGCCTnrrnTrTTn 7200 



" * I E S P 0 B 



-ORF pLM1



E r ' L 0 P N L 0 A T L 
GCTGGCATCAGCTATCTTAGCTCCTCCTCrCCCCTCTrrTrTT Trail in 



GFGNHCHPR 



r A £ 




«SP A PGGEOEGGG 






ACCAA,CAOCCTGTAAAAATGACACCCAACCATGGCTrGC ACrTGAGCTTCAGGATGrrGACCrTCrCCAACAACGTGGAGCCAGC CAAr C r.r, Tr . T .^ 
fGGTrAGTCGGACATTTTrACTGTGGGTTGGTACCGAA CGTGAACTCGAAGTCCrAC AACTGGAAGAGGTrGTTGCACCTCGGrCGGTTACCGAAG GACC^ WC0 



fig 54 pLM1 (1 > 8285) Site and Sequence






AGSAGGGACACGTTCTTCCTCCTGTACCITTGAGAACTTCC 



TCCTCCCTGTCCAAGAACCACGACATGGAAACTCTTGAAGGATCC TTCC T 



TAGGAAGGAATGGTGGGGTGGCGTTTGGGAACTTGTGCCCCCrAAACACATTTACTGCC 



7«00 



^ C T C S VCCTFENFLGft 



N G G VAFGNLCPLNTF T 



c rcc rc taatgac tttggggaaaaga 



GASGAGATTACTGAAACCCCTTTTCTAC 



tgattctgggtctttcccttgacttcttgtttcaattacaaactcctgggctttctgggg. 



AGGGGTTCAGAAAA 




7500 



L V G X 0 0 5 



S F P 



LLVSITNSVAFVGG 



V 0 K 



CATC AAAACACTGCA GCAGTrcCCCGGAATTCAGCTTGGACTTAACCAGGCTGAACTTGCTCAAAAGAAGCCGAArTCCAGCACACTGGCG GCCGTTACT 
GrAGTTTTGTGACGTCGTCAAGGGGCCTTAAGTCGAACCTGAATTGGTCCGACTTGAACGAGTTTTCTTCGGCTTAAGGrCG TGTGACCG CCGGCAATGA 78 °° 



TSKHCSSSPE 



FSLDL Tft LNL tKaSRIPAHVRPL U 
AGTTCTAGAGCGGCCGCCACCGCGGTGGAGCTC CAATTCGCCCTATAGTGAGTCGTATTACGCGCGCTCACTGGCCGTCGTTTTACA ACGTCGrGACTGG 

^agatctcgccggccgtggcgccacctcgaggttaagcgggatatcactcagcataaIgcgcgcgagIgaccggcagcaaa^ 7700 



V V L 0 



R 0 V 



V > E R P p p R w s S H S P Y S £ S Y Y A P S L A 

GAAAACCCTGGCGTrACCCAACTTAATCGCC TTGCAGCACArCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGA GGCCCGCACCGATCGCCCTTCCCAAC 

CrTTTGGGACCGCAATGGGTTGAATrAGCCGAACGTCGrGTAGGGGGAAAGCGGrCGA CCGCATTATCGCTTCTCCGGGCGTGGCTAGCGGGAAGGGTTG ^ 
g N P 6 y r 0 U N P L A A H P P f A S y R h S t t A » T D Q p S Q 

agttgcgcagcctgaatggcgaatgggacgcg ccctctagcggcgcattaagcgcggcgggtgtggtggttacgcgcag cgtgaccgctacacttgccag 

TCA,CGCGrCGGACTTAC C GCTTACCCTGCGCGGCACArCGCCGCGTAATrCGCGCCGCCCA C ACCACCAATGCGCCTCGCACrGGCGATGTGAACGGK ^ 

olrsl wgevda p csg a lsaagvvvtr s v t a t l as 

CGCCCfAGCGCCCGCTCCTTTCGCTTTCTTCCCTTC CTTTC ^CGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCT AAATCGGGGGCTCCCTTTAGGGTTC 

gcgggatcgcgggcgaggaaagcgaaagaagggaaggaaacagcggtgcaagcggccgaIaggggcagttcgagatttagccccccagggaaatccc^^ boco 

■ " ' A ' A P F A F F » S r T F « C F «> « 0 A L N R G L P U G F 



CGATTrAGTGCrTrACGGCACCTCGACCCCAAAAAACrrGATTAGGGTGATGGTTCA C GTAGrGGGCCATC G Cr C Tr, A r fl . A rn. TTT 



GCTAAATCACGAAArGCCGTGGACCTGGGGTTTTTTGAACTAArCCCACTACCAAGTGCATCACCCGGMGCGCGACrArCTGCCAAA 

GDGSR 5GP S P . . r V F R p L 



TTCGCCCTTTGA 
AAAAGCGGGAAACT 



aioo 



C07 7GGAGTCCACG rrcrTTAAfAGTGGAC T CTTGTTCCAAAC FGGAACAACACTCAACCCTATCTCG GTCTATTC 
GCAACCTCAGGTGCAAGAAArrATCACCTGAGAACAAGGTT rGACCTTGTTGTGAGTTGGGATAGAGCCAGi 



T L t 5 T F F 



* S G LLFOTGTT 



TTTTGATTTATAAGGGA TTTTGCC 
ATAAGAAAACTAAAT ATTCCC taaaacgg 



LNPISvySFOL 



ntCGCCTATTGGT r AA AA AA TGAGC TG A f T 



G I L P 



TAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCrTACAATTTAG 



"-^«^M»TTm*CTC e *CT^TOTnHMATT0C«CTr«»Af;OTTTUT«TF««*TCTMMli 
' 5 * I W L " " « ■■ ' • " « F N » NFNKILTLTl 



8285 



fig 55 pCB251 (1 > 8197) Site and Sequence
Enzymes : All 148 enzymes (No Filter) 

Settings: Linear. Certain Sites Only. Standard Genetic Code 
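
The three-row layout used in this printout (upper strand, complementary lower strand, and a translation under the Standard Genetic Code named in the settings) can be illustrated with a minimal sketch. This is not the program that produced the figure; the helper names `complement` and `translate` are hypothetical, and the sample input is the short stretch of this figure that is translated there as MGGSHHHHHH.

```python
# Illustrative sketch only: `complement` and `translate` are hypothetical
# helper names, not part of the software that produced the printout.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-paired lower strand, written left to right under the upper strand."""
    return "".join(PAIRS[base] for base in seq.upper())


def translate(seq):
    """Translate frame 1 of seq; stop codons are shown as '*', trailing bases are ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    leader = "ATGGGGGGTTCTCATCATCATCATCATCAT"
    print(leader)              # upper strand
    print(complement(leader))  # lower strand
    print(translate(leader))   # one-letter amino acid codes
```

The sketch accepts only unambiguous A, C, G and T; characters damaged in reproduction would need to be resolved against the original drawing before use.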

GACGGATCGGGAGATC TCCCGATCCCCTATGGTCGACTC TCAGTACAATC TGCTCTGATGCCGCATAGTTAAGCCAGTA TC TGC TCCCTCCTTGTGTCTT 

CTGCCTAGCCCTCTAGAGGGCTAGGGGArACCAGCTGAGAGTCATGTTAGACGAGACTACGGCGTATCAArTCGGTCATAGACGAGCGACGAACACACAA W 
, T 0 R , E 1S RS PMVQS QYN LL.CR I V K P VSAPCLCV 

GGAGGTCGCTGAGTAGTGCGCGAGC AAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGT TAGG 

CCTCCAGCGACTCATCACGCGCTCGTTTTAAATTCGATGTTGTTCCGTTCCGAACTCGCTGTTAACGTACTTCTTAGACGAATCCCAATCCGCAAAACGC ^ 
GGR , V vR E QN LSY NKARLDR Q LHEESA.G.AFC 

CTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGArTATTGACrAGTTATTAATA GTAATCAATTACGGGGTCATTAGTTCATAGCCCATATM 
GACGAAGCGCTACATGCCCGGTCTATArGCGCAACTGTAACTAATAACTGATCAATAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTArAT ^ 
AA SRCTG Q IY ALTL1 1D .LI IV1NYG V I S S . P I V 

TGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATA ATGACGTATGTTCCCATAGT 
ACCTCAAGGCGCAATGTATTGAATGCCATTTACCGGGCGGACCGACTGGCG6GTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCA ^ 
G V P ft Y I T Y G K V P AWL T A Q ft P p p 1 DVNNDVCSHS 

AACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTAC6C CC 
TTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTGATAAATGCCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGCrTCATGCGGG *°° 
NAN RDFPL TS MGG LFTVWCP LG S TSSVSYAKYA 

CCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAG TACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGT ATTAGTCA 
GGATAACTGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCATGTACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGT ^' 
P Y . ' R 0 • R . ; M A , R I A L C P V H D L M G L S Y L A V H L R I S H 

TCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATC AATGGGCGTGGATAGCGGTTTG ACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAA 

AGCGATAATGGTACCACTACGCCAAAACCGTCATGTAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAGTT ™ 
* Y Y H G D A V L A V H Q V A V t A V I T G I S K S P P H . R Q 

TGGGAGTTTGT TTTGGCACCAAAATC AACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGrAGGCGTGTACGGTG 
ACCCTCAAACAAAACCGTGGTTTTAGTTGCCCTGAAAGGTTTTACAGCATTGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGCACArGCCACCCT: 
V E F V L A P K S T . G L S K M S . Q L R P I D A N G R . A C T V G 

G ^ G ^ A ^ A ^* AG CAGAGCTC TCTGGCTAACTAGAGAACCC ACTGCTTAC TGGCTTATCGAAAT TAATACGAC TC ACTATAGGGAGACCCAAGC TGGCTAGC 
CAGATATATTCGTC ^CGAGAGACCGATTGATCTCTTGGGTGACGAATGACCGAATAGCTTTAATTATGC TGAG TGATATCCCTCTGGGTTCGACCGATC3 ^ 

I > , 

T7 promoter priming site

G 1 Y K ° S S L * N ■ » T H C L L A Y R N . y Q s L . G D P S V L A 

gtttaaacttaag cttaccatggggggttctcatcatcatcatcatcatggtatggctagcatgactggtggacagcaaatgggtcgggatctgtacga: 

CAAATriGAATTCGAATGGTACCCCCCAAGAGTAGTAGTAGTAGTAGTACCATACCGATCGTACTGACCACCTGTCGTTTACCCAGCCCTAGACATGCTa ^ 

— , 



-ProBond binding domain



fKLKLTMGGSHHH HHHGMASMTGGQ OHGRDL YD 

GArGACGATAAGG TACCCGGATCCTTCCGAGACCCCACG3ACGATGTTCACGGCTCAGTGCTGTCCCT3GCCTCCAGTGCCrCCTCCACC TACTCC TCA3 

CTACTGCTATTC CATGGGC CTAGGAAGGCTCTGGGGrGCeTG CTACAAGlGCCGAGTCACGACAGGGACCGGAGGTCACGGAGGA SS Tnr.Arf.Af.r.A,-.I-' ' ' * 
. > I 



DDK VPGSFR QP T Dp VHGSVLSLASSASS7 



S o 




fig 55 pCB251 (1 > 8197) Site and Sequence 

CTGAGGAGAGGATGCAATCTGAGCAAATCCGGAAGCTTCGTAGGGAAC TGCAATCATC CCAGGAAAAAGTGGCCACCTTGACGTC TCAGCTTTC TGCCAA 
GAC TCCTCTCCTACGTTAGAC TCGTTTAGGCCTTCGAAGCATCCCTTGACC TTAGTAGGGTCCTTTTTCACCGGTGGAAC TGC AGAG TCGAAAGACGGTT ' 



-pCB251 insert = U2



-U2 ORF



AEERMOSEQ ! RKLRRELES5QEKVAT 



LTSOLSAN 



rGCTAATCTGGTGGCTGCTTTTGAGCAGAGCCTGGTGAATATGACATCCCGCCTGCGACACCTGGCAGAGACGGCCGA GGAGAAGGACACTGAGCTGCTG 

acgattagaccaccgacgaaaactcgtctcggaccacttatactgtagggcggacgct gtggaccgtctctgccggctcctcttcctgtgactcgacga;- ' Jw " 

-pCB251 insert = U2



-U2 ORF



A N I V A A F E Q S L V N M T S R L R H L A E TAEEKOTELL 

gatttgcgagaaaccatagactttctgaagaaaaagaac tctgaggcccaggcagtcattc agggagcccttaatgcctcagaaaccacacccaaagaac 
ctaaacgctctttggtatctgaaagacttctttttcttgagactccgggtccgtcagtaagtccctcgggaattacggagtctttggtgtgggtttcttg 



-pCB251 insert = U2



-U2 ORF



0 L R , E T 1 0 F > * K K N S E A Q A V j Q G A I NASET TPKE 

ttcggatcaagagacaaaactcctc agatagcatctcaagcctcaacagcatcac tagccattccagcatcggcagca gcaaggatgctgatgcgaaaaa 

AAGCCrAGTTCTCTGTTTTGAGGAGTCTATCGTAGAGTTCGGAGTTGTCGTAGTGATCGGTAAGGTCGTAGCCGTCGTCGTTCCTACGACTACGCTTTTT 




K R Q N , S S °S!SSLNSlTSHSSIGSSKOAOAK» 



GAAGAAAAAAAAGAGTTGGGTCTATGAGCTTCGAAGTTCCTTCAACAAAGCGTTCAGTATAAAAAAGGGGCCCAAGTCAGCTTCCTCATACTCGGA rATM 
CrTCTTTTTTTTCTCAACCCAGATACTCGAAGCrrCA AGGA AGTTGTrTCGCAAGTCA TATTTTTTCCCCGGGTTCAGTCGAAGGAGTATGAGCCTATA: 

-pCB251 insert = U2



-U2 ORF 



K < K . K 5 V V Y E L R S SF M K A F $ f K KGPKSASSYSDI 
GA3GAGATTGCTACACCCGACTCTTCAGCCCCCTCATCCCCCAAACTACAGCATGGTTCTACAGAGACTGCTTCACCCTCCATCAAGTCCTCCACCTTGT 



CTCCTCTAACGATGTGGGCrGAGAAGTCGGGGGAGTAGGGGGTTTGATGTCGTACCAAGATGTCTCTGACGAAGTGGGAGGTAGTTCAGGAGGTGGAACA 
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CCTCCGTGGGCACrGATGTCACCGAGGGCCCTGCTCACCCAGCCCCCCACACTAGGCTGTTCCATGCAAArGAG GAGGAGGAGCCAGAGAAGAA3GAGoT 
GGAGGCACCCGTGACTACAGTGGCrCCCGGGACGAGTGGGTCGGGGGGTGTGATCCGACAAGGT ACGTTTACTCCTCCTCCTCGGTCTC7TCTTCCTCCA 

-pCB251 insert = U2




S S V G T 0 V T E G P A H P A P H T R L FHANEEEEPEKKE 



A ^CGGAGCTGCGCTCTGAGCTATGGGAGAAGGAAATGAAGCTTACAGACATCCGCTTGGAGGCCCTCAACTCTGCCCACCAAC TGGATCAGCTTCGGGAG 
TAGCCTCGACGCGAGACTCGATACCCTCTTCCTTTACTTCGAATGTCTGTAGGCGAACCTCCGGGAGTTGAGACGGGTG GTTGACCTAGTCGAAGCCCTC 

-pCB251 insert = U2




S £ L R S E L V E K E M K LTOIRLEALNSAHQLOQLRE 



ACCATGCACAACATGCAGTTGGAGGTGGACCTGCTGAAAGCAGAGAATGACCGACTGAAGGTAGC CCCAGGCCCCTCATCAGGCTCCACTCCAGGGCAGu 
TGGrACGTGTTGTACGTCAACCTCCACCTGGACGACTTTCGTCTCTTACTGGCTGACTTCCATCGGGGTCCGGGGAGTAGTCCGAGGT GAGGTCCCGTCC 

-pCB251 insert = U2



20a 




T M H N M Q L £ V D L L K A E N D R L K VAPGPSSGSTPGQ 

TCCCrGGATCATCTGCATTATCTTCCCCACGCCGCTCCCTAGGCCTGGCACTCACCCATTCCTTCGGCCCCAG TCTTGCAGACACAGACC TGTCACCCA7 
AGGGACCTAGTAGACGTAATAGAAGGGGTGCGGCGAGGGATCCGGACCGTGAGTGGGTAAGGAAGCCGGGGTCAGAACGTCTGTGTCTGGACAGTGGGTA 



-pCB251 insert = U2 



-U2 ORF



V P G S S A L S S P R R S L G L A L T H S F G P S L A D T D L S P M 

GGATGGCATCAG TACTTGTGGTCCAAAGGAGGAAG TGACCC TCCGGGTGG TGG TCAGGATGCCCCCGCAGC AC ATCATCAAAGGGGACTTQAA GCAGCAvi 
CC rACCGrAGTCArGAACACCAGGTTTCCTCCTTC AC TGGGAGGCCCACCACCAC TCC TACGGGGGCGTCGTG TAGTAGTTTCCCCTCAAC TTCGTCGTC 




° G S T C G P K £ £ V T L R V V V R |i P P Q HI 1KG0LKQQ 

GAATTCTTCCTGGGCTGTAGCAAGGTCAGTGGAAAAGTTGACTGGAAGATGCTGGATGAAGCTGTTTTCCAAGTGTTCA AGGACTATATTTCTAAAATGG 
CTTAAGAAGGACCCGACATCGTTCCAGrCACCTTTTCAACTGACCTTCTACGACCTACTTCGAC AAAAGGTTCACAAGTTCCTGArATAAAGATTTTACC 

-pCB251 insert = U2

-U2 ORF

> G C S K V S G KVDVKMLOEAyFQvFKOV I 5k ft 
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ACCCAGCCTCTACCCTGGGAC TAAGCAC TGAGTCCATCCATGGCTACAGCATCAGCCACG rGAAACGAGTGTTGGATGCAGAGCCC CCCGAGATGCCTCC 
TGGGTCGGAGATGGGACCC TGATTCGTGACTCAGGTAGG TACCGATGTCGTAG TCGGTGCAC TTTGC TC AC AACCTACGTCTCGG3GGGCTC TACGGAGG 



-pCB251 insert = U2



-U2 ORF



D P A S T L G L S T E S I H G Y S i S H V KPWLOAEPPENPP 



TTGCCGTCGAGGTGTCAATAACATATCAGTCTCCCTCAAAGGTCTGAAGGAGAAATGCGTCGACAGCCTG GTGTTCGAGACGCTGATCCCCAAGCC,^ 

aacggcagctccacagttattgtatagtcagagggagtttccagacttcctctttacgcagctgtcggaccacaagctctgcgactaggggttcggcta!i 2t 



GATG 



-pCB251 insert = U2



-U2 ORF



crrgvnnisvslkglkekc v oslvfetlipkpm 

atgcagcactacataagcctcctgctgaagcaccggcgcctcgtcctctcgggccccagcggcacgggcaagacctacctg accaatcgcttggccga^ 
tacgtcgtgatgtattcggaggacgacttcgtggccgcggagcaggagagcccggggtcgccgtgcccgttctggatggactggttagcgaaccggctca 26C ' : 



-pCB251 insert = U2



-U2 ORF



" 0 H . Y < S L L L K H R R L V L SGPSG T Q K TYL TNRLAE 



acctggtggagcgctctggccgtgaggtcacagagggcatcgtcagcaccttcaacatgcaccagc agtcttgcaaggatctgca actgtatctttccaa 
tggaccacctcgcgagaccggcactccagtgtctcccgtagcagtcgtggaagttgtacgtggtcgtcagaacgttcctagacgttgacatagaaaggtt " 7a 



-pCB251 insert = U2



-U2 ORF 



V L V E * S G « E V T £ G 'VSTFMH HQQ3CICPLQLYL $ N 
CC TAGCCAACCAGATAGACCGGG AAACAGGAATTGGGGA fGTGCCCCTGG rGATTCTATTGGATGACCTGAGTGAAGCAGGC TCC ATCAGTGAG TTGG K 

ggatcggttggtctatc tggccctttgtcc ttaacccc t acacggggaccactaagataacc tactggactcact tcg tccgagg ta gtcac rc aacca!- 

-pCB251 insert = U2




-U2 ORF



L A N . Q ' D .« E T C I G D V P L V 1LL0PL SEA CS I S E L V 

AATGGGGCCCTCACCTGCAAGTATCATAAATGTCCCrATATTATAGGrACCACCAATCAGCCTGTAAAAATGACACCCAACCATGGCTTGCA CTTGAGCT 
rTACCCCGGGAGTGGACGTTCATAGTATTTACAGGGATATAATATCCATGGTGGTTAGTCGGACATTTTTACTGTGGGTTGGrACCGAACGTGAACTCGA 



2*X 



-pCB251 insert = U2



-U2 ORF

W G A > T C K Y H K , C P Y | I G f T N 0 P V K M f p N H G L H L 3 
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rCAGGATSTTGACCTTCTCCAACAACGTGGAGCCAGCCAATGGCTTCC TGGTTCGTTACCTGAGGAGGAAGCTGG TAGAG TC AGACAGCGACAfCAATGC 
AGTCC TACAACTGGAAGAGGTTGTTGCACCTCGG TCGGTTACCGAAGGACCAAGCAA TGGAC TCC TCCTTCGACCATC TC AG TC rGTCC CTGTAGTTACli 

-pCB251 insert = U2



-U2 ORF



F R M L T F S N N V £ p A N G F L V R Y L ft ft K L V £ S D SO IMA 
C AACAAGGAAGAGC TGC TTCGGGTGCTCGACTGGGTACCCAAGC TGTGGTATCATCTCCACACCTTCCTTGAGAAGCACAGC AC CTCAGACTTCCTCaTC 

gttgttccttctcgacgaagcccacgagctgacccatgggttcgacaccatagtagaggtgtggaaggaac tcttcgtgtcg tggagtctgaaggagtag 



-pCB251 insert = U2



-U2 ORF



N * E - E L L R V L P V v p , < L V Y H L H T F L S K H S T 5 0 F L ! 
gccccttgcttctttctgtcgtgtc^ TC T AC AGGA AG 

CCGGGAACCAAGAAAGACAGCACAGGGTAACCGTAACTC^ 




GPC FFLSCPl GtEDFR T V F I D L V N N S I ! P Y L 0 £ 



GAGCCAAGGATGGGATAAAGGTCCATGGACAGAAAGC TGCTTGGGAGGACCCAGTGGAAfGGGTCCGGGAC ACACTTCCCTGGCCA TCAGCCCAAC AAGA 
CTCGGTTCCTAC CCTATTTCCAGGrACCTGTCTTTCGACGAACCCTCCTGGGTCACCTTACCCAGGCCCTG TG TGAAGGGACCGG TAGTCGGGTTGTTCT 

-pCB251 insert = U2



-U2 ORF



° A - * ° G 1 V H G .0 < A A V £ p p y £ V V R Q T L P W P S A Q Q 0 

CCAATCAAAGCT GTACCACCTGCCCCCACCCACCGTGGGCCCTCACAGCArTGCCTCACCTCCCGAGGATAGGACAGTCAAAGACAGCACCC CAAGTrCT 
GGTTAGT TTCGACATGGTGGACGGGGGTGGGTGGCACCCGGGAGTGTCGTAACGGAGTGGAGGGCrCCTATCCTGTCAGTTTCTGTCGTGGGGrTC AAGA 



-pCB251 insert = U2



-U2 ORF



° S K 1 Y H > p P P T V G P H S i A S P P E 0 R T V K D 3 T P S 3 

C TGGACTCAGAT CCTC TGATGGCCATGC TGCTGAAACTTCAAGAAGCTGCCAACTACA TTGAGTCTCCAGATCGAGAAACCATCC TGGACCCCAACCTTC 
GACC TGAGTCTAGGAGACTACCGGrACGACGACTTTGAAGTTCTTCGACGGTTGATGTAAC TCAGAGGTC rAGCTCTTTGGTAGGACCTGGGGTTGGAAG ^ 



-pCB251 insert = U2



-U2 ORF

L ° S ° P L " A " I. I * L Q E AAMYICSPORETILDPNL 
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tig 55 pCB251 (1 > 8197) Site and Sequence 

AGGCAAC AC TTTAAGCGTTCGGCAATCACTGTCACCCCCGGACAGCAGAAC6C TGGCATC AGCTATCTTAGCTCCTCCTC TCCCC TC TCCTC TTTC AGAG 


TCCG TTG TGAAATTCCC AAGCCG TTAGTGACAGTGGGGGCCTGTCG TC TTGCGACCG TAG TCGATAGAATCGAGGAGGAG AGGGG AG AGGAG AAAG TC TC 



-pCB251 insert = U2



-U2 ORF

Q * T L . G F G N H C H P R T A E ft V H Q LS.LLLSPLLFQ3 

CAC TGGCTCTCCAGCCCCAGGAGGAGAACAGGAGGGAGGAGGAGATGAAAGAGGAGGGACAGGTTCTTGGTGCTGTACCTTTGAGAACTTCCTAGGAAGG 

3700

GTGACCGAGAGGTCGGGGTCCTCCTCTTGTCCTCCCTCCTCCTCTACTTTCTCCTCCCTGTCCAAGAACCACGACATGGAAACTCTTGAAGGATCCTTCC 



-pCB251 insert = U2



T G S P A P G G £ Q E G G G D E R G G TGSVCCTFENFLGP 

A ATGGTGGGGTGGCGTTTGGGAACTTGTGC CCCCTAAACACATTTACTGGCCTCCTC TAG AGG6CCCGTTTAAACCCGCTG A TCAGCCTCG AC TGTGCCT 

TTACCACCCCACCGCAAACCCTTGAACACGGGGGATTTG TGTAAATGACCGGAGGAGATCTCCCGGGCAAATT TGGGCGACTAGTCGGAGCTGACACGGA " 

-pCB251 insert = U2

NGG VAFCNLC P LN TF TGL L RARLNPL I SLDCA 

TCTAGTTGCCAGCCATCTGTTGTTTGCCC CTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTG 


AGATCAACGGTCGGTAGACAACAAACGGGGAGGGGGCACGGAAGGAAC TGGGACC TTCCACGGTGAGGG TG AC AGGAAAGGATTATTTTAC TCC TT TAAC 
F LPA IC .CLP LPRAFLD P G RCHSHCPFL I K GNC 

CATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGG ATTGGGAAGACAATAGCAGGCATGCTGGGGA 
G TAGCGTAACAGAC TCATCCACAGTAA5AT AAGACCCCCCACCCCACCCCGTCCTGTCGTTCCCCCTCCTAACCCTTCTG TTATCGTCCGTACGACCCC t 
{ A L , S E v S F Y S G G V G GAGQOGGGLGRQ 0 A C V G 

TGZGG FGGGC TCTATGGCTTC TGAGGC3GAAAGAACC AGCTGGGGC TC TAGGGGGTATCCCCACGCGCCCTGT AGCGGCCCAT TAAGCGCGGCGGGTG TG 
ACCCC ACCCGAGATACCGAAGACTCCGCCTTTCT7GG TCGACCCCGAGATCCCCCATAGGGGTGCGCGGGACA TCGCCGCGTAATTCGCGCCGCCC ACAC "'^ 
C 5G L YGF . GG K NQ LGL . G V S P P A L .RR IKPGGC 

GTGGTTACGCGCAGCGTGACCGCTAC AC TTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTC TTCCCTTCCT TTCTCGCCACGTTCGCCC- GCT TTCCCC 
CACCAArGCGCGTCGCACTGGCGATGTuAACGGTCGCGGGATCGCGGGCGAGGAAAGCGAAAGAAGGGAAGGAAAGAGCGGTGCAAGCGGCCGAAAGGG- 
GGYAQR P R yrCQRPSAR SFRFLPFLSRHVR3USP 

G TC A AGC TC TAAATCGGGGCATCCCT TTAGGGTTCCGAT TTAGTGC TT TACGGCACC TCGACCCC AAAAAACT TGA TTAGGGTGATGGTTC ACGTAGTGC- 
C A3 TTCGAGATT TAGCCCCGTAGGGAAATCCCAAGGC TAAATCACGAAATGCCGTGGAGC TGGGGTTTTTTGAACTAATCCCACTACCAAS TGC ATCACC 
3 . S S , ^SG HPFRvp [ .CFTAPRPQKT.LG.WFT.V 

GCC ATCGCCCTGATAGACGGTTTTTCGCCC TTTGACGTTGGAGTCCACGTTCTTTAATAGTGGAC TCTTG T TCC AAAC TGGAACAACAC TC AACCC TATC 
CGC TAGC GGGAC TATC TGCCAAAAAGC3GGAAACTGC AACCTCAGG TGC AAGAAATT ATCACC TGAGAACAAGGT TTGACC TTG TTG TG ACT TGGG AT AG 
AIA L 'QGFSP FOV GVHVL . . V T L V P N V N N T 0 P t 

''-'^ A ^|CTTTTGATT TATAAGGGATTTTGGGGAT T TCGGCCTATTGG TTAAAAAATGAGCTGA T TTAAC AAAAA TTTAACGCGAA T TAA f TC r<;TC- 
AGCCAGA f A AG AAA AC TAAATAT TCCC TAAAACCCC T AAAGCCGGA TAACC AATTTTTTAC TCGAC TAAATTG TT T TTAAAT FGCGC TTAA " 7AAGAC AC 
L 3 L F F ■ F ' * OFGOFGLLVKK AQLTK t . R E 1 L •* 



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gaatgtgtgtcagttaggg tgtggaaagtccccaggc TCCCCAGGCAGGC AGAAGTATGCAAAGCATGCATCTCAATTAGTCAGC AACC AGGTGTGGAAA 


CTTACAC ACAGTCAATCCCACACCTTTCAGGGGTCCGAGGGGTCCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAATCAG TCGTTGG TCCACACCTT7 

NVCQLGCGKSPGSPGRQKYAKHASQLVSNOVVh 

G TCCCCAGGC TCCCCAGCAGGCAGAAGTATGCAAAGCATGCArCTCAATTAGTCAGC AACCATAGTCCCGCCCCTAAC TCCGCCCATCCCGCCCCTAAC T 


CAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAATCAGTCGTTGGTATCAGGGCGGGGATTGAGGCGGGTAGGGCGGGGATTGA 

VPRLPSRQKYAKHASQLVSNHSPAPNSAHPAPN 
CCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGAC TAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCTGCCTC TGAGCTATTCCAGAAG TAGT 


GGCGGGTCAAGGCGGGTAAGAGGCGGGGTACCGACTGATTAAAAAAAATAAATACGTCTCCGGCTCCGGCGGAGACGGAGACTCGATAAGGTCTTCATCh 
SAQFRPFSAPVL TNFFYLCRGRGRLCL A I P E V V 


GAGGAGGCTTTTTT6GAGGCCTAGGCTTTTGCAAAAAGCTCCCGGGAGCTTGTATATCCATTTTCGGATCTGATCAAGAGACAGGATGAGGATCGTTTCG 


CTCCTCCGAAAAAACCTCCGGATCCGAAAACGTTTTTCGAGGGCCCTCGAACATATAGGTAAAAGCCTAGACTAGTTCTCTGTCCTACTCCTAGCAAAGC 

RRLFVRPRLLOKAPGSLY IHFR ( S R 0 R M R I V S 

CATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTC TGAT 

5000

GTACTAACTTGTTCTACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACTA 

HO.TRVIARRFSGRLGGEAIRL.LGTTDNRLL. 
GCCGCCGTGTTCCGGCTG7CAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGC AGG ACGAGGC AGCGCGGC 


CGGCGGCACAAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAACAGTTCTGGCTGGACAGGCCACGGGACTTACTTGACGTCCTGCTCCGTCGCGCCG 
CRRVPAVSAGAPGSFCODRPVRCPE TAGRGSAA 

TATCGTGGCTGGCC ACGACGGGCGTTCCTTGCGCAGCTGTGC TCGACGTTGTCAC TGAAGCGGGAAGGGAC TGGCTGCTATTGGGCGAAGTGCCGGGGCA 

5200

ATAGCACCGACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAACAGTGACTTCGCCCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGT 

I VAGHDGRSLftSCARRCH .SGKGLAA I GRSAGA 

GGATC tc c TG TC ATCTC AC CTTGCTCCTGCCG AG A AAGT at ccatcatggctgatgcaatgcggcggctgc at acgcttgatccggctacc tgccc attc 

CCTAGAGGACAGTAGAGTGGAACGAGGACGGCTCTTTCATAGGTAGTACCGACTACGTTACGCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAj 

gspv ispcscres ihhg.cnaaaaya. s g y l p ! 
gaccaccaagcgaaacatcgcatcgagcgagcacgtactcggatggaagccggtcttgtcgatcaggatgatc tggacgaagagcatcaggggctcgcgc 


ctggtggttcgctttgtagcgtagctcgctcgtgcatgagcctaccttcggccagaacagctagtcctactagacctgcttctcgtagtccccgagcgcg 
rppsetshrastysdgsrscrsg.sgrrasgara 

cagccgaactgttcgccaggctcaaggcgcgcatgcccgacggcgaggatctcgtcgtgacccatggcgatgcctgcttgccgaatatcatggtggaaaa 
gtcggcttgacaagcggtccgagttccgcgcgtacgggctgccgctcc tagagcagcactgggtaccgctacggacgaacggcttatagtaccacctttt 
srtvroaogaharrrgsrrdpvrcllaeyhgg* 

tggccgcttttctggattcatcgactgtggccggctgggtgtggcggaccgctatcaggacatagcgttggctacccgtgatattgctgaagagcttgg: 
accggcgaaaagacctaagtagctgacaccggccgacccacaccgcctggcgatagtcctgtatcgcaaccgatgggcactataacgacttctcgaaccg 
vplfwihrlvpagcggplsghsvgyp yc RAV/ 

ggcgaatgggctgaccgcttcctcgtgctttacggtatcgccgctcccgattcgcagcgcatcgccttctatcgccttcttgacgagttcttctgagcgg 


CCGC T fACCCGAC tggcgaaggagc acgaaatgcc atagcggcgagggct aagcgtcgcg TAGCGG AAG AT agcggaagaac tgc tcaagaagactcgcc 

RRMG. PLPRALRYRRSRFAAHRLLSPS R V L L S '» 
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GAC TC TGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCC TTCTAT GAAAGGTTGGGC TTCGGAA TC 
CTGAGACCCCAAGCTTTACTGGCTGGTTCGCTGCGGGTTGGACGGTAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTA3 ^ 
r LG FEM TQQATPN LPS ROF DSTAAFYERLGFG I 

GTTTrCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGC TGGAGTTCTTCGCCCACCCCAACTTGTTTATTGC AGCTTATAATGGTTACA 

1 '' '' * ' ' | . t , , j, , . i „ , , _ j f " *>0(*-i* 

CAAAAGGCCCTGCGGCCGACCTACTAG6AGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGGTTGAACAAATAACGTCGAATATTACCAATGT 
VFQ . 0AGVMtLQRG , OLML£ PFAHPNLF 1 AAYNGY 

AATAAAGCAATAGCATCAC AAATTTCACAA ATAAAGCATTTTTTTCAC TGCATTC TAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTG 


TTATTTCGTTATCGTAGTGTTTAAAGTGTT TATTTCGTAAAAAAAG TGACGTAAGATCAACACCAAACAGGTTTGAGTAGTTACATA6AATAGTAC AG AC 
K • SNS 1 T NFT NKAFF SL HSS CGL SKL INV5YHVC 

TATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATT CCACACAACATACGAGCC 
ATATGGCAGCTG6AGATCGATCTCGAACCGCATTAGTACCAGTATCGACAAAGGACACACTTTAACAATAGGCGAGTGTTAAGGTGTGTTGTATGCTCGG 
j p s T S S . S L A . 5 V S . L F PV.NCYPLTIPHNIRA 

GGAAGCATAAAGT6TAAAGCCTGGGGTGCCTAATGAGT GAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGT 


CCTTCGTATTTCACATTTCGGACCCCACGGATTACTCACTCGATTGAGTGTAATTAACGCAACGCGAGTGACGGGCGAAAGGTCAGCCCTTTGGACAGCA 
G S I K C K A V G A . .V S. LTL1A LRSLPAFQSGNLS 

GCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCAC TGACTCGCTGCGC TCGGTCGT 
CGGTCGACGTAATTACTTAGCCGGTTGC6CGCCCCTCTCCGCCAAACGCATAACCCGCGAGAAGGCGAAGGAGCGAGTGACTGAGCGACGCGAGCCAGCA 
CQ , LH ' • I GQR AGRGGLR I GRSSASSLTOSLRSVV 

TCGGCTGCGGCGA6CGGTATCAGCTCAC TCAAAGGC5GTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAG AAC ATG TGAGCAAAAGGCCAG 

AGCCGACGCCGCTCGCCATAG tcgag tgagtttccgccattatgccaataggtgtcttagtcccctattgcgtcctttcttgtacac tcgttttccggtc ° Ua 

R L R R A V S A H S K A V 1 R L S T E S G P N A GK NM.AKGO 

CAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAG^ _ 
GTTTTCCGGTCCTTGGCATTTTTCCGGCGCAACGACCGC AAAAAGGTATCCGAGGCGGGGGGAC TGCTCGTAG TGTTTTTAGCTGCGAGTTCAGTCTCCA °° U 
OKA RNRKKAA LLA FF HRL R PPOEHHKNRRSSQP 

GGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAA6C TCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACC fG TI 
CCGCTTTGGGCTGTCCTGATATTTCTArGGTCCGCAAAGGGGGACCTTCGAGGGAGCACGCGAGAGGACAAGGCTGGGACGGCGAATGGCCrATGGACAG 
V R N P T G L . R Y Q A F P P G S S L VRSPVP TLPLTGYL3 

CGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGG CTGTGTGCAC 
GCGGAAAGAGGGAAGCCCTTCGCACCGCGAAAGAGTTACGAGTGCGACATCCATAGAGTCAAGCCACATCCAGCAAGCGAGGTTCGACCCGACACACGTG 
A FL . PSG SVALSQC SRC RY LSSV.VVRSKL6CVH 

GAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCC3GTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGC AGCAGCCACT3 
CTTGGGGGGCAAGTCGGGCTGGCGACGCGGAATAGGCCATTGATAGCAGAACTCAGGTTGGGCCATTCTGTGCTGAATAGCGGTGACCGTCGTCGGTGAC ^ 
E P P V Q P 0 R C A L S G H Y R L E 5 N P V R H Q L S P L A A A T 

GTAAC AGGATTAGCAGAGCGAGG tatgtaggcggtgc tacagagttcttgaagtggtggcctaactacggc tacac T AGAAGG AC AG TATTTGG TATC T2 
CATTGTCCTAATCGTC TCGCTCC AT AC ATCCGCCAC3 ATGTC TCAAGAAC TTCACCACCGGA TTGATGCCG ATGTGATCT TCC TG TCAT^AACCATAGAC °' M 
Gw *?SRA_RV VGGArEPLKVUPrJYGVTRRTVFSIC 
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CGCTC TGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATC CGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAo 
GCGAGACGACTTCGGTCAATGGAAGCCTTTTTCTCAACCATCGAGAACTAGGCCGTTTGTTTGGTGGCGACCATCGCCACCAAAAAAACAAACGTTCGTC 
A L L K P V T F G K R V G S S . S GKQTTAGSGGFFVCKQ 

CAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACG CTCAG TGGAACGAAAAC TC ACGT TAAGGGATT TTGG 

GTCTAATGCGCGTCTTTTTTTCCTAGAGTTCTTCTAGGAAACTAGAAAAGArGCCCCAGACTGCGAGTCACCTTGCTTTTGAGTGCAArTCCCTAAAACC 
0 I T R R K K G SQEOPL IFSTGSDAQVNENSR. GIL 



TCATGAGATTATCAAAAAGGATCTTCACCTAGA TCCTTTrAAATTAAAAArGAAGTTTTAAArCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAo 


AGTACTCTAATAGTTTTTCCTAGAAGTGGATCTAGGAAAATTTAATTTTTACTTCAAAATTTAGTTAGATTTCATATATACTCATrTGAACCAGACTGT^ 

VMRLSKRIFT, ILLN.K.SFKSI.SIYE.TWSDS 

TTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATC TGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGG 


AATGGTTACGAATTAGTCACTCCGTGGATAGAGTCGCTAGACAGATAAAGCAAGTAGGTATCAACGGAC TGAGGGGCAGCACATCTATTGATGC TATGCC 

Y Q C L I S E A P I S A I C L F R S S 1 VA.LPVV. ITTIR 

GAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACC AGCCAGCCGGAAGGGCCG 
CTCCCGAATGGTAGACCGGGGTCACGACGTTACTATGGCGCTCTGGGTGCGAGTGGCCGAGG tc taaatag tcgttatttggtcggtcggccttcccggc /UiX 
E G L P S G P 5 A A M I P RO P RSPA PDLSAINOPAGRA 

agcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagag^ 

tcgcgtcttcaccaggacgttgaaataggcggaggtaggtcagataattaacaacggcccttcgatctcattcatcaagcggtcaattatcaaacgcgtt #oU 

ER RSG PA TLS AS IQS IN C CREARVS8SPVNSLRN 

CGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTArGGCTTCATTCA GCTCCGGTTCCCAACGATCAAGGCGAGrTACATGATCC 
GCAACAACGGTAACGATGTCCGTAGCACCACAGTGCGAGCAGCAAACCATACCGAAGTAAGTCGAGGCCAAGGGTTGCTAGTTCCGCTCAATGTACTAGG 

V , V A . * A T G I V V S R S S F GMASFSSGSGRSRRVT.S 

ccc atgt tgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactc atggttatggc AGCACTGC 

GGGTACAACACGTTTTTTCGCCAATCGAGGAAGCCAGGAGGCTAGCAACAGTCTTCATTCAACCGGCGTCACAATAGTGAGTACCAATACCGTCGTGACG 
P M I C K K A V S S F G P P 1 V V R SKL AAVLSL M V M A A L 

ATAArTCrCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCA AGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTG 
TAT TAAG AGAATGACAGTACGGTAGGCATTCTACGAAAAGAC AC TGACCAC TCATGAG TTGG TTCAGTAAGAC TCTTATCACATACGCCGC TGGCTCAAC 
H N S L T V M P S V R C F S V T G E Y S T K S F . E . CMRRPSC 

CTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACG TTCTTCGGGGCGAAAACTCTCAAGG 
GAGAACGGGCCGCAGTTATGCCCTATTATGGCGCGGTGTATCGTCTTGAAATTTTCACGAGTAGTAACCTTTTGCAAGAAGCCCCGCTTTTGAGAGTTCv; 
5 C P , A SIR0NTAPHSRTLKV U1 IGKRSSGRKISR 



3C< 



ATCTrACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAA 
TAGAATGGCGACAACTCTAGGTCAAGCTACATTGGGTGAGCACGTGGGTTGACTAGAAGTCGTAGAAAATGAAAGTGGTCGCAAAGACCCACTCGTTTTT 
1 1 P , L L R SSSM.PTRAPN.SSASFTFTSVSG.AK 



C AGGAAGGCAAAATGCCGC AAAAAAGGGAATAAGGGCGACACGGAAAToTTGAATACTCATACTCTTCCTTTTTC AATATTATTGAAGCA TT TATC AGGG 
G TCC T TCCGT TTTACGGCGTTTTTTCCC TTATTCCCGC T3TGCCTTTACAACT TATGAGTATGAGAAGGAAAAAGT TATAATAAC TTCG TAAATAG TCCC 
T G R QN A A *C K G I R A T R K C . I L 1 L F L F 0 Y Y . 3 I Y 0 0 
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.TTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTC 


AATAACAGAGTACTCGCCTATGTATAAACTTACATAAATCTTTTTATTTGTTTATCCCCAAGGCGCGTGTAAAGGGGCrTTTCACGGTGGACTGCAG 
VCLMSGYIFECI K N K Q 1 GVPR TFPRKVPPfJV 
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10 20 40 (jO 70 
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AACCTTGCATGCCTGCAGGAA7TCGATATC AAGCTTATCGATACCGTCGACC.* rCGAGGATC AGAAGAAAT /O 
I GGAGCAACTACCCACATCCA"TATGCCACCCGCGGTTTCTAAGTGAGTTTAATTrTGAGTT I'ACGAC TA 1 00 

CAAAAATGTGT I'C I* f f AA i'AAC I' A ! C I ICGAC I' I'GAG TC I'A ITC TGTATGAC rAGTTGTTG AGTCATT (T ? 10 
I CAT rSAGAAAATATTAAAAGGAACATTATTTACTTTGCTTATTTGCCCTAACTT^'GAT f I'AG FTTTTCr: 200 

A rCAACTAGATCTTACAAAACTTGCAATACAATTCCATTTTCAGATTACCC TCGCi.' ACGTGTCGCC ACGT 360 

360 370 380 390 400 410 420

CAGCAACC(-.CTTCAGCAACTAACCCAAA~TCCAACTTTCCACAAATGTCAACATCt:ACiGr r rCAGACTCC WO 
AC AGrCAAGAATATCGAAAATTGGTAAGAATTTTATTTTGAGCTCAAAC fTG TATAAAATCCCCAG AAAA 490 
GAAGA I SA I AAAAA l*G I'AG T rTTTT T GiT!AAAAC7TCCACCTTTATTGCTC7AATA"*GACGGCT7ATATCT GtiO 
CAATTTTCTTGAGT TTTATCAAAAAATTTTCC ACTATACAAATGTAGAAAAGTA I"' f fGCACAAATTTTG 030 
ICAGTTGACAGCrTTGTAATAGATCCAAArGGAACCrAGATACAAGCrGTTAAAG'-GGAAGCACCGCAAG 700 
710 720 730 740 750 760 770

TO I'A ' ACTGGAAATAATGATCTGAAAC AAATTTGTGCTATTCTCAAA I'G f T rAAG/\CATGTTTTGAAGA T 770 
TTTTTCAAATTCGCAC rA(iTTTCAGAACCT T CCTTTTTGTATGAAAAAGTAAAAA«AAACTATTTCAAAr 
CCTCACCGCCACCArGTTTCAACTCTTAATTTTTATAAAATTTTGCAATTTACAAMl'CGCC rCCCCT~GC 9") 
CCGAAAAGTGCCCACCAAAAfCAArrrC r C GGC TTC AT AA TG ACT T TT A A AT TG A" GTG AG AAA AC AC AG 3*0 

a a^ ag^pta ap t aaa r rfi Af Ar.r.r; a r a cc-T7G rrr r rc rrrrrrr TrrrrpTCCC^CC fCCTCCTTCCCiT 10W 
1060 1070 1080 1090 1100 1110 1120

TCCATCTCCAACAACAACAAIT TC^AAT r TCCTTGTCCATTTTGCTTATAAACA" rrGTGTGTOCAAGG I 120 
AAACTACACGGGGAGACGGTCAA^'AA-tcgAATGAoAGCATGGCAATTAC rcr r CGGAA A T TG A TG AA 1 MJO 
I AAAGATAGAGCCGATGACACTGGCTGGTAGTAGTATGAG TG fAGAATTGCTTTT'CATCGTCTCAAC I I' 1280 
GCGCA I'GAG fCT TCCCCCGCTCTCATCAC7CACAAT7AA7GTCGGG I" V rTATGCGC: T CTTTCCTATTCCG 1330 
CCAC rCATTCTGGGTTACC ACAAAC TSGAA I'ACA I' r T TAG rACTATTCAAGCC AT'TATTT TGA fA T TTA !40C 
1410 1420 1430 1440 1450 1460 1470

ATTTTGTGCAATTAGGGATAAACACliACTT TTAAAAGTTTATTTAAAAAAACGAT;, IT TTCGATTTTAAA IU70 
AAATCTGAAAAG rrrCAAAAAATCAA-AAATATTCCCTAACAAATrGTATGGCTAAAATTTTArrrC TAC J SW 
TG TTGAC AA IA TC f T TATATG T ATC AC TG T TTTCC ATC TC AAAAC C r TGAATCCCC CAAGT TA TAGCiAAG 16:0 
C7CCGTC TCACATTTCCCArGCTATG-ATCGCTAC TCAGCACATATCCAAAAATTaAGCTAGACSG T i'GA 1 680 
I AATTATTGGGCACGCGTAATAAAG rGCAAGCAGTTAGAATTTTAAT T CAAGCACAGATTATC TA TCAAA I7S0 
1760 1770 1780 1790 1800 1810 1820

rTCAArCTTTGAACATiCAGCCAGTTCGVACAArf rTCCATGCTTTTTCGCCCAT^ AAAAAaCTT7CTc:a ifPO 
CC rCTTCATCCATCTCACTCG I'A fC AT AAAAAG7ATACCAAAAGC0CG AC rc TAG \ TTTTAAGAG AAGGA 133.1 
uATACTGAGCCACA rGGCGTGTGACCCrTTTCArC rCG rCCGT TCGGTCTCAAATICACW: rc AT ACT A A 19S0 
CTC:T7CAAArAGCCATAGACCTCCTTC"TTC7TCT"GTTTTGACTCGCGCC I'A I 7TTTTGTGGC TGCC 2030 
rG <X AAGCCGGGAAAA rTTAGTATAT~TA~G AGCTTA r CTT TATGCAATAC ATAAA/ AACGAGGCAATT7A 2100 

?1?0 2130 21'K) 2150 2:r)o 2170 





CT AGA rCTTAOTTGCCC ATAAGCTC AAGCCCAACC AGAAA TGACTTGC ATTTAuT \ I AAGCCT AG^TTGA 2^iW 
C T TGC T »*GCTTC AGTC T AA '*^CAGAC* f AlVA 77TCC AAGAG AG rT' r TCAATTTTAA/ TGI f fCC AT.TT7C f 2450 
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2460 2470 2980 2490 2600 26 tC 2520 
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TfiTTACTTAAAATCTTAATGCCC !'G rGATGCGTAAAATCGTTATCCCTTTCTCTCACACTTTCAATTACA 2520 
CA TTCA PCAAAGA 1' PGGTATCAAGCCAAAG ACGTCTGGAC P TAAACCACCC 'CATCA1CAACCAC \ I'CAI 2590 
CAAATAATACAAATTCATTCCGTCCGTCGAGCCG I" PCGAG TGGCAATAATAATGTTGGCTCGACOATaTC 2t3(50 
CACATCTCCCAAGAGCTTA3G TATCCGATCCTTCCGGCTTCTTTT7AGAAAT TA TA \ PA X X IXACAA CCA 2730 
TCATCAACGrACAGCTCTATTTCGAArcrAAACCGACCTACCTCCCAACTCCAAAAACCTTCTAGACCAC 2S00 
2810 2820 2030 28<10 2850 2880 2870 


AAACCCAGC TAG TTCGTGTTGCTACAACTACA AAAATCGGAAGC rCAAAGC I'AGAGGA PCCCCGGG A I ! G 2870 

GCCAAAGGACCCAAAGGIAIGI I ICGAAI6A! AC! AACAI AACAI AGAACAI I I ICAUGAUCSACCC i I K\K\ 2940 

AGGGTACCGGTAGAAAAAATGAGTAAAGGAGAAGAACTTT PCAC I'GGAG X tQ IXCCAA I PC I' PG ! PGAA i 30 10 

TAGATGGTGATGTTAATG3GCACAAATTTTCTG7c;AGTGGAGAGGG PGAAGG PGA PGCAAG A PACGGAAA DOHO 

ACTTACCCTTAAATTTATTTGC ACTACTGG AAAAC TACCTGTTCCATGGGTAAGTTTAAAC ATATATATA 3 I GO 

3 ISO 3170 3100 3190 3200 3210 3220 

CTAACTAACCCTGATTATTTAAATTTTCAGCCAACACTTGrCACTACTTTCTGTTATGGTGrrCAATncr 3220 
•C rCGAGATACCCAGA TCA PA PGAAACGGC ATGACTTTTTCAAGAGTGCCATGCCCGAAGo TTATGTACA :*2<J0 
GGAAAGAAC TATATTTTTCAAAGATGACGGGAACTACAAGACACG I'AAG X PTAAACAGT CCGG I'AC PA AC n:jGO 
i AACCA PACA TAT X PAAA7 X X TCAGG TGC PGAAGTCAAGTTTGAAGGTGATACCCT TGTTA ATAGAATCG 3430 
AG TTAAAAGG TA I'TGAT r F T AA AG A AG ATG G A AAC ATTCT TGG AC AC AAA TTGCiAATACAAC TATA AC TC 3600 

3610 3620 3630 3640 3650 3580 3670 


AC ACAATGT4TACA7CAT3GCAGACAAACAAAAGAATGGAA fCAAAG X PG PAAG X X PAAAC P fGGAC r PA 3570 
C ! AAC TAACGGA : PA PA f X f AAA X X " f CAGAACTTCAAAATTAGACACAACATTGAAGATGGAAGCGTTC 3640 
AACrAGCAGACCATTATCAACAAAA'ACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTA 3710 
CCTGTCCACACAATCTGCCCTr rCGAAAGA ; C CCA AC G AAAAC AG A G ACC AC ATGG TC C TT C T TGA G T 77 3/80 
G7AACAGCTGCTGGGATTACAC ATGGC ATGGATGAAC I'A P AC AAA PAGCAT FCGTAG AATTCCAAC TGAG 3860 

3860 3870 3800 3890 3900 3910 3920 


CGCCGGTCGCTACCA ITACCAAC P I'G PCTGGTGTC AAAAATAATAGGGGC CGCTGTC ATCAG AG7A AG T7 3^20 
T A A ACT G AG T T C T AC 7 A AC T A ACG AG 7 A A TAT TTAAA X X X 1V.A6CAVC PCGCGCCCG PGCC TC PGAC X I'C 3990 
MAC, PCCA.MI'AC 10 ri'CAACATCCCTAGATGCTCTTTCTCCCT'iTGCTCC^ACJCrCC PAP PP P rCTfAT 'I0<50 
TATCAAAAAAACTTCTTCTTAATT^CTTTG fT X T X PAGCTTC~TTTAAGTCACCTCTAACAATGAAAT r G 4100 
fGT AGATTCAAAAA TAG AA T TAATTCGTAATAAAAAGTCGAAAAAAATTGTGC rCCCTCCCCCCATTAAT 4200 
4210 4220 "230 4240 4250 4280 4270 


a a taattctatcccaaaa rc pacacaatgttctgtgtacacttct TATcr r r r r it iacttc pgat aaat 42 /o 

X X P P P IGAAACATCATAGAAAAAACCGCACACAAAA rACCTTATCATATGTTACGTTTCAGTTTATGAC 43'I0 
C'iCAATTTTTA"TTCTTCGCACG7C rGGGCCTCTCATGACGTCAAATCATGC i'CAI'CG f'GAAAAAGTTTT 44 10 
SG^GTATTTTTOGAAiT I I i'CAATC AAGTG AAAGTTTATGAAATT AATTTTCCTGC TTTTGCT TTTTGGG 4480 
fi:i"TTCCCCTATTGT7TGTCAAGAG X f rCGAGGACGGCG T TTTTCTTGCTAAAA PC ACAAU I'A I" I'G A TGA 'iSfO 

4660 46/0 4580 4690 4600 4610 462C 
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GCACGATGCAAGAAACA TCGGAAGAAGGTTTGGGTTTGAGGCTCAGTGGAAGG I'G AG PAG A AG TTGATAA WJQ 
TTTGAAAGTGGAGTAGTG r CTATGGGGTTT~TGCCTTAAATGACAGAA PACAT TCCCAATATACCAAACA 'iGyO 
! AAC i'G I I I CC PAC*"AGTCGGC CGTACGG5CCCT FTCGTC TCGCGCGTTTCGGTGA TGACGGTGAAAACC M /60 
TC T G AC AC A T GC AG C TC C C G G A G AC G G fCACAoCT TGTCTG'AAGCGG A PGCCGGC ACCAC. AC AAGCCCG 4830 

ti:agggcgcgtcagcggg pg r pgcc cggtg t r n gg gc. tgg c r r a a c t a tg c g c c a t c ag ag c a g A* r tg pa 49cc 
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4910 4920 4930 4940 4Q60 4960 '1970 
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C'GAGAGTGCACCA TA TGCGG TG fGAAATACCGCACAGATGCGTAAGGAGAAAATACOGCATCAGGCGGC 4VJ/0 
C i'AAGGGCC i*CG I'GA ; ACGCC TATTTTaTAGGT TAATGTCATGATAATAATGGTTTCTTAGACGTCAG SO 40 
G r GGCAC T T r rCGGGGAAATCTGCGCGGAACCCCTATTTGTTTATTTTTC TAAA rACATTC AAATATfiTA f>t :0 
rCCGCTCATGAGACAArAACCCTGATAAATGwTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAA *i \«) 
OA f TTCCGTG TCGCCC I IA V ICCC f f fTT"GCGGC ATTTT6CCTTCCTGTTTTTGCTC ACCCAGAAACGO fi'iuO 

5260 5270 5280 5290 5300 53 10 5320 
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IGG TGAAAG TAAAAGA TGC r GAAGA TCAG r I GGG I* GCACGAGTGGGTTACA TCGAAC 1'GGA I 0 I CAACAG 5320 
CG G T AAG ATCCTTGAGAG TTTTCGC CCCGAAG AAC GTTTTCC AATGATGAGCACTT TTAAAG T TCTGC TA *390 
TG TGGCGCGGTAT TATCCCCTATTCi ACGCCGGGCAAGAGCAACTCGGTCGCCGCATAC ACTATTC7CAGA KU6C 
ATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGrAAGAGAATTATG 6630 
GAG I'GC I'GCCATAACCA I GAG I'GA AACAC TG CGG CCA AC T7ACTTCTGACAACGA7CGGAGGACCG A AG £600 

5610 £620 KB3C) bb40 b6bO b660 h« /O 
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G.AGC fAACCGCTTT FTTGCACAACA fGGGGGATCA FG T AACfCGCCTTGA TCGTTGGGAACCGGAGC TCA 6670 
a:"oAAGCCATACCAAACGACGAGCGTGACACCACGATGCC ^G TAGCAA r GGCAACAACG rTGCGCAAACT 5740 
A* fAACTGliCGAAC TACTTACTC T AGCTTCCCGGC AAC AAT PA A f AGACTGGATGfi AGGCGGA7AAAG TT 5610 
CC AGC.ACCACTTCTGCGCTCGGCCC TTCCGGCTGGCTGGTTTATTGCTGATAAATC TGGAGCCGGTGAGC bSBO 
:j i GGG rc TCGCGG rATC AT TGC AG C AC rCGGGCCAGA IGG PAAGCCC f'CCCG TA IX G I AG f TA IX r AC AC G9GC 

6960 69/0 5960 5990 8C00 60 1 0 (5020 
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G A ::: G G G G AG T C A G G C A A C TA TG G A 7 G A AC G A A A T A G AC AG A TC GC TG AG A TAG G TG C C TC AC TG ATT A AG 6020 
CA I l*GG I'AAC TG TCAGACCAAG r i' VAC I'CA ! 'A IATACTTTAGATTGATT7AAAAC T TC ATTTT TAA I T ! A 60CO 
AAAGGA I'CrAGG I GAAGA l*CC rTTTTGATAATCTCATGACCAAAATCCC T TAACGTGACiTTTTCCTTCCA bKSC 
CTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCT T GAGATCCTTTTTTTCTGCGCGTAATCTGC 6?30 
T GCTTCCAAACAAAAAAAf.C ACCGCTACCAGCGGTGGTTTG7TTGCC3GATCAAGAGC TACCAAC TC T T T 6300 
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" 'CC 6 AAGG ( AAC VGGCTTC AGC AG AGCGC AGATACCA AATAC TG T CC TTCTAG I'G rAGCCfi I'Afi f i'AGG 6370 
C C A C C AC T T C A AG A AC T C TG TAG C A C C GC C T A C A T ACC TCGCTC I'GC TAA IX C 'G T TACCAGTGGCTGCT BU«0 
GC C A GTG C C GAT A AG TC G TG TC T 7 A C I" G Ci G r I'GGAC fC A A G AC G A 7 AG 7T A C C GC» A T A ACi G C G C AC C G G T 66:0 
CCGGCTGAACGGGGGGTTCGTSCACA::AGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGArACCr 65^0 
AC A G C G T G A GC A T TG AG A A A GC G C C AC GC T TC C C G AAG GG AG A A A G G C GG AC AGG T A T CCG (i f AAGCGoC 3650 

63RC 6670 6660 6G90 67X 6/10 b/20 
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AGGGTCGGAACAGGAGAGC6CACGA jGGAGCTTCCACGGGGAAACGCC rGGTAICT XT ATAGTCCTGTCC 6720 
GG r r TCGCC ACCTCTGACTTCAGCo r CCjA ."VT7TGTGA TGCTCGTCAGGGGGGCGGA^CCTATGGAAAAA 6 /«) 
CGCCAGCMCGCGGCCTTTTTACGG TCC7GGCCTTTTGC rGG CC T T T TG C TC AC A TG T TC T T TCC T G CG 6060 
TTATCCCCTGAT TfJ r« J GGA rAACCG T A7TAC--.GCC P I' rGAGTGAGCTGA TACCGC TCGCC SCAGCCGAA 6900 
C7 ACCGAGC6CAGCGAG7CAG I'GAGCG AGvlAAGCGGAAGAGCGCCCAATACGC AAA(XGCC TC7CCCCGC 7000 

7010 7020 7030 /040 7060 7060 7070 
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GCC7T0OCCGATTCA f r AAT AO AOC TOGO ACQ ^CAGCTTTCCCC AC rCCAAAGCGC GCACTGACCGCAAC 70 70 
GC AAT T AATG f*6A(i I" I'AGCTCACTC A^'AGGC ACCCCAGGCTTTACAC T !" PATGC7 TCCGGCTCGTA7GT 7 1 '10 
i'G I G TGGAATTGTGAGCGGA r AAC A & T T~C AC AC A GG A AAC AG C F ATGACCATGA'" TACGCCA AGC TG TA K> K; 
AG 7TTAAAC ATGA7C 7TAC I'AAC TAAC 7ATTCTCA T X I' AA AT7TTC AGAGCTT AAA AAFGGC PGAAA fCA /2SC 

ct:acaacga7ggatacc,ctaacaa: r : ggaaatgaaat ''319 
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An-iACCAIGAM ACGCCAA^CTTGCATGCCTGCACiGAATTCGATATCAAGCTTATCGATACCGTCGACCT JO 

ii AG C A T C AG A A G A A AT TGG AC C A AC "I" AC*. C C AC A 7 C C A f f'A I'GCCACCCGCGG I' T I'C I'AAG TGAu TT TAA 1 <IO 
TT 7 7GAG TT FACGAC rACAAAA ATGTG~TCTTTAATAACTATCT T CGACTTGAGTC TATTCTGTATGACT ';u> 
AG TTGTTGAGTGATT7TTCATTGAGAAAATATTAAAAGGAAC A f"A f f TACT TTGC T TATTTGCCCTAAC 280 
T-TTGATTTAGTT7TTCGATCAACTAGA""CTTACAAAACTTGCAATACAATTCCATTTrCAGATTACXCTC 3fiO 
360 370 360 390 WO 4!0 420 
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:CCACGTGTCGCCACGTCAGCAACCGCTTCAGCAAC r A AC CC A AA ITCC AAC I' r FCC AC AAA ( G I'CAACA il?0 
TCCAGGCTTCAGACTCCACAGTCAAGA^TATCGAAAAT fGGTAAGAA!' F 7 TA T F ITGAGC f CAAAC ITG r OGO 
ATAAAArGCCCAGAAAAGAAGATGATAAAAATGTAGTTTTTTTGCAAAACTTCCACCTTTATrGCTCTAA 660 
7A7GACGGC7TATA7C7CAATTTTCT7::A3TTT7A TC A AAAA AT F T TCCAC r A TAC AAA FO I AGAAAAG T 630 
A- ! I I GC AC AAA I J' I' KG I'C AG F FGACAGC 77TGTAATAGATCC AAATOOAACCTAG ATACAAGC T GTTAA AJO 

710 720 730 7U0 7C0 7C0 770 
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AG i G G A A GG AG C G C A AG I'C ; A TAG I GGAAA IAATGATC7GAAACAAATTTGTGCTATTCTC A AA7GTTTA //() 
-^ACATGTrTTGAAGA^TrrrrCAAATTCGCACTAGTTTCA^AACCTTCCTrTTTGTATGAAAAAGTAAA MO 
* A A A A AC T A T T T C A A AC C C TC AC C G C C AC C A T G TT FCAAC TCT7AATTTT 7 AT AAA A7TT7GC£ ATT T AC ttIO 
*A£ i'CGCC I'CCCC 7 r GCCCGAAAAG* r GCCCACCAAAATCAA7TTC TCGGC 7TCATA A 7GAC T T r FAAA 7 f 930 
.'A i'G <"G A G A A A A C AC AC A AG AG G C T A A .1 7 A A A T T£ AC A GG G A C AG G 7 7 3 T C C C T C J' IC TCCC ICC * TC I'C 1050 
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:CGCCTCCTCCTCCCG r rCCAfC TCCAACAACAACAATTTTCCAATTTCGTTGTCCA r TT TGC r I"A rA AA 1 l?0 
JA I I I'G I'G rG '"GGAAGUAAACTACACS 3G3 AGACGGTCAA T7AATTCGAA rGAGAGOA I'GGCAA.TAC I'C 1 190 

'ttc.ggaaattgatgaataaagataca:x::ga rCACACTGGCTGGTAGTAGTATGAGTGTAG \ATTGCTT I2G0 
TTTCATCGTCTCAACT T GCGCA FGAG "... r I CCCCCGCTCTCATCACTGAC AATTAATGTCGGG ^TTTATG 1330 
■lOCTCTTTCCTAT TCCGCCACTC AT C7GGG7TACCACAAACTGG AA ~AC A I i' I' I'AC TACT A T I'CAAGCC 1400 

mo ia20 iu30 mo mco i^ioo vi70 
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AT TTATTTTCATaV T TTAA f I" T I '^ VGCA^^T AGGGATAA AC ACGAC TT r TAAAAGTT TA TTTAAAAAAACG 14/0 

atattttcgatt^taaaaaa igaaaagtttcaaaaaa~caa taaa"a r rccc i aacaaattgtatggc io<*o 

! AAAATl I :'Af 1' TCTAC TGTTCACAA~A7CTT TA TA I G I'ATCACTGTTTTCC ATCTCAAAACC TTGAA r C 1610 
C! CCAAGTTaXTAGGAAGCTCCG TGTCACATTTCCCATGCTArGAA I'CGC f ACTCAGCACATATCCAAAAA 1600 
T7 AAGCTAGACGGTTGATAAT FA I' I'GG Ci C A C G ?. G T A A T A A AG T GC A AG C A C I' TAG A AT TTTAATTCAAGC 1 76C 

1 760 1 770 ' 700 1 /90 1 80C 1810 1 8?0 
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a:;aGA7TATCT>;TCAAATTCAA"C I* r f GAACATTCACCCAGTTCG 7ACAA r I i I CCATGCT7TTTGGCCC 1320 
ATTAAAAAACTT rc TCACC TC7 "CATCCATCCACTCG l*A ) C A 7 A A A A AG TAT AGO A A A AG C C C G A C IC I 1600 
AC 7TTTT AAGAGAAGGAGATACGAGCCACA 'GGCG 1 G TG ACCCT T T7CA7C7CGTCCG I' TCGG I'C TC AA VJ(3(; 
A T T C AC G C T C A T A C I A AC TC T TC A A A T AGC C A 7 «iG AC C TC C TT G T TT T C T fC 7 7C G T T T TG A C TCG C G CC ?X)20 

•v. r r r r r ig tgg c t g c c t g a a a c j c c gg g a a a a 7 t t ag t at a t t r a tg ag c r i'atctttatgcaataca7a 2 :co 

51 SO '2i20 2:30 21UH ?|fj() 2W0 2170 
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AAAAACGAGGCAAI ■ rAAAAATA T 7AAAA7 TAA 1GAGGT7G7AGA 7G TAG A I I I'GCi AAAAGAAGAAAAAA SI /O 
A«* Ai AAC AAATAGGAACCGCCAGA TCAAAATTC 7ATT I'AAAGG f r TTC AAGATCTT TAGGC AAGATTCGG WW 
\ ; A AC A G A AA A C TG A AG TG CC ISC A rAAATCTAGTGTAACGTTT AGATTGAACTCGG AAA TCC TAAGCC WO 
T r j i*. ACTA TAGCC T T A f :'C rAGATCT^ATiTTiCGCA I' A A G C TC A AG CCC AAG C A G AA A T G AC TTCCA |" I" I A ,?3<50 
G! * I AAGCC l'AGA77GACT7CC T rGC r rCAGrCTAATCCAGACTAGATTTCCAAGAGACM ! I ICa\AI I I i 2 ! l f 5v 
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AAATCTTTCCAGTTTC r CG T I'AC ITAAAATCTTAATGC CCTGTGA7GCGTAAAATCGT TATCCCTTTCTC 2! WO 
I C AC AC I' I f'CAATTACAGATTC ATCAAAGAT7GGTATCAAGCC AAAGACG TC PGGAC I" I'AAACCACCC I'C 2590 
ATCATCAACCACTTCATCAAATAAi AC AAA !' I CATTCCGTCCGTCGAGCCCTTCGAGTCGC AATAATAAT 26150 
GTTGCCTCCACCATATCCACATCTGCGAAGAGCTTAGGTATCCGATCCTTCCGriCTTCTTTTTAGAAArT '*> /3C) 
ATATTAT7TCAGAATCATCATC AACGTACAGC I'CTATTTCGAATCTAAACCGACCT &CC7CCC AAC Tf.CA 9000 

2610 2820 2630 2840 2850 2860 2870 
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AAAACC r rCI AGACCACAAACCCAGCTASTTCGTGTTGCTACAACTACAAAAATCCGAAGCTCAAAGCTA 20/0 
GCCGCTCCGAAAGCCGTGAGCACCCCAAAACTTGCTTC rGTGAAGAC TAT TGGAGCAAAACAAGAGCCCG 2940 
AT A ACA3 CGGTGG TGGTGG TGGTGG A ATGCTGAAATTAAAGTT AT TCAGTAGCAAAAACCC A rCTTCC TC 30 J 0 
A I' C G A A T AG CC C AC A AC C T AC G A G A A AGSC GCi C GG C GG TG C C T C A AC A AC A A A C T T Tu TC G A A A A FCGC I* DOfiO 
OCCCCAGTG AAA AGTGGCC TG A AGCCCCCG AC CAGTAAGCTGGG A AG7GCCACGTC TATGTCGAAGf T T T 31*0 

3160 3170 3180 3190 3200 3210 3220 
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GTACGCCAAAAGTTTCCTACCG7AAAACGGACGCCCCAA7CATATCTCAACAAGAC TCGAAACGATGCTC MM 
AAAGAGCAGTGAAGAAGAGTCCGGA7ACGCTCCATTCAAC AGCACGTCGCCAACGTCATCATCGACGGAA 3290 
GO TTCCCT AAGCAVGCATfCCACATC^TCCAAGACTTCAACGTCAGACGAAAAGTC TCCGTCATCAG AC G 33G0 
A T C T T AC TC 7T A ACG CC T C C AT C G T G A C A G C T ATC AG A C A G C C G A T AG CC G C A AC A CC GG T T T C TC C. AAA 3430 
r A T T ATC AAC AAGCC TC.T TG AG G A AAA AC C AAC AC PGSCAG I'GAAAGGAG fGAAAAGC ACAGCGAAAAAA 3500 
3610 3520 3530 3540 3550 3560 3570 
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GA FCC AC C TCC AGC TGTTCCGC C AC G' r G AC AC CCAGCC AAC A ATCGGAGTTGT TAG TCCAATTATGGC. AC 36/0 
ATAAG£AGTTGACAAA T GACCCCGTGATATCTGAAAAACC AGAACCTGAAAAGGTCCAATC A A TGAGCAT 3640 
C G A C AC G AC GG A C G T TC C A C CG C T T C C AC C PC r AA A A TC A S T T G T TCC AC T T A A A A TG AC T T C A A T C C. G A 3/10 
CAACCACCAACCi T ACGATGTTC TTC T A AA AC A AGG AAA AATC A CA7CGCC T GTCAAGTCGTTTGGAT A TG 3 /no 
AoCAGTCGTCCGCGTCTGAAGAC fCCA^TGTGGCTCATGCGTCCGCTCAGGTGACTCCGCCG ACAAAAAC 36f;0 

3860 .3870 3880 3800 39C0 :.«J10 MVX) 
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T T C T GG T A A TC A T TC GC T GG AG AG A AG G A T (i G G A A AG A AT AAG AC A TC AG A A T CC A G C GGC T A C AC C TC T 3S2C 
GAC6CCGG TG r I'GCGA i'G IGCGCCAAAATGAGGC.AGAACC TGAAAGAATACGA I'GAC A i'GAC I CG I'CGAG 3090 
CACAGAACGGCTATCCTGACAACTTCGAAGACAGT rCCTCCTTGTCGTCTGGAATATCCGATAACAACGA 406*3 
GC TCGACGACATATCCACGG AC GATT rCTCCGGAGTAG AC ATGGC AAC AG TCGCC ICC A AAC A I'AGCGAC 4130 
TA TTCCCAC TTTGT fCGCCA I'CCCACGTCTTC TTCC7CAAAGCCCCG AG 7CCCCAG CCGGTCC TCCACAT 4200 
4210 4220 4230 4V40 4260 4260 4270 
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C A G T CC A T T C TC G A T C TC G AG C AG A AC AG G AG A A 7 G TG T A C A A AC "TC TG TC C C AG TG CCG A A CG A G C C A 4270 

ACG IGGC GCCGC TGCCACCTCAACC TTCGGAC AACAT FCGCTAAGATCCCCGGCiATAC TCATCCTATTC I" ''340 

CCACAC; r I'A rCAG TG TC AGC TG AT A AGG AC AC AATGTC T A r CC AC TC AC AG AC TAG rCGACGACC MC I l 44 '0 

CACA AAAACCAAGC TAT~CAGGCCAATTTC A r fCACTTGATCG TAAATGCC ACCTTC AAGAG T fCAC A CC 4**flO 

C A C C G AG C A C AG A A T GG C G G C IT. f C TTGAG CC C GA GAC GG G "G CC G AAC rCGATGTCGAAATATGA'TCT a 550 

4560 *570 «SK) ^690 4600 4610 4o?0 
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TCAGGATCCTACTCGGCGCG r I'CSCeAeSTSfiAAflCTCTACTflflTA TV. I'A I'GCAGAGACG 1 ICCAAC ICC Mti^.- 

AC AG AC TAT CCG A I'G AAAAATCCCCCGC AC AT PC TGCC A A A AG T G AG A TG G G A rccCAAC fA TCAC TGGC '1*5130 

TAGC ACG AC AGC ATATGG A TC I C I'C AATG AG A AGTACGAACA TGC TATTCGGG ACA TGGCACG rCACTTG -1760 

G A G T G T T AC A AG A A C AC !'G rCGAC TCAC T AACC AAGAA AC AGGAG AAC TA7GG AGC A T FGT T I'G A PC I i' I' ^030 

I hiAGCAAAAGCTTAGAAAAC:CACTCAA:;ACAf T GATCGATCCAACTTGAAGCC :GAAGAGGCAA TACG 'IQQQ 
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A '' I'CAGGCAGGACATTOCTCATTTGAGGGA fATTAGCAATCATCTTGCATCCAACTCAGCTC/ TCCTAAC 49/0 
GAAGGCGCTGGTGAGCTTCTTCGTCAACC4 PC fCTCOAATCAGTTGCATCCCATCGATCATCG ATG TC AT 5040 
CG r C GTC G A A A A G C A G C A AG C AG G AG A AG A TC AGC TTG AG C PCCJ r T PGGC AAGAAC AAGAAGAGC PGGA I 5 ! 10 
COXTCCTCACTCTCCAAG I* I C ACCAAGAAGAAGAACAAGaVACTACGACGAAGCACAT ATGCCA fCAAT 5180 
rCCGGATCTCAAGGAAC PC I' I'G ACAAC ATT GA TGTGATTG AGTTG AAGCAAG AGC PCAAAGAACGCGA ."A 5250 

6260 5270 5280 5290 5300 5310 5320 
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GTGCACTTTACGAAG rt'CGCC I I'GACAATC TGGATCGT3CCCGCG AAGTTGATG f PC PGAGGGAGACAG f 5*320 

hAALAAla I I liilAAALl.UAUiUU* y\AULAATTAAAGAAAGAAGTGGACAAAC TCACCAACoo rCCAGCCAC I 3390 

CCiTGCTTCTTCCCGCGCCTC AATTCC AGTTATCTACG ACGATG AGC ATGTCTATG A TGC AGC GTG T AGC A f)M60 

GTACATCAGCTAGTCAATCTTCGAAAC3ArCCTCTGGCTGCAACTCAATCAACGTTACTGTAAACGT3GA 5530 

CAICGC I'GGACiAAATCAGTTCGATCGTTAACGGGG ACTTG AAGCAGCAGGAATTCTTCCTGGGCTGTAGC btj<X> 

6610 5620 5630 5640 5650 5660 b6/0 
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AAGG I'CAG PGGAAAAG I' PGACTGGAAGATGCTfiGATGAAGCTGTTTTCCAAGTG T rCAAGGAC PA I A f V P 5670 

CTAAAATGGACCCAGCCTCTACCCTCG3AC TAAGCAC PGAG PCCA I'CC A FGGC TAC AGC ATC AGCC AC G T h/40 

GAAACGAGTGTTGGATGCAGACCCCCCCGAGATGCCrCC rrGCCG I'C GAG G I'G ICAATAAC ATATC AG TC SH10 

TC CCTCAAAGGTCT6AA6GAGAAA f GC 3TC G ACAG CCTGG TGTTC6 AG AC GCTGATCCCC A A6CC6 A I'GA 5680 

r>5C AGC ACT AC A f A AGCC ICC I 6C I'GA AGC ACCGGC6CCTCGTCCTCTCGGGCCCC AGCUCCACGCGC AA 5950 

6960 5970 59HO 5990 6000 6010 6020 
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GACCTACCTGACCAATCGCTTGGCCGAG T ACCTGG f K 6 AG C G C TC T G G C C G TG AG 3 TC AC A G AGG G C A TC 6020 
G fGAGCACC T TCAACA TGCACC AGC AGTCTTGCAAGGATCTGCAACTGTA**"CTT PCCAACC I AGCCAACC 6090 
AGATAGACCGGGAAACAGGAAT TGGGGA I'G P0.CCCCTGCTGATTC7ATTG6ATGACC TGAG ToAAGCAGG 6160 
CTCCATCAG TGAGTTGGTCAATGGGGCCC PCACCTGCAAGTA T CA T AAATGTCCCTA PAT I" A TAGG J'ACC 6230 
/*CCAA TCAGCC I'G I AAA A A TG AC AC CCAACC A TGGCT rGCACi I'G AGC T T C AG GAT G 7 TG A C C TTC T C C A 63C0 

6310 6320 b:*:*! 6340 6350 6360 WA"; 
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ACAACGTGGAGCCAGCCAATGGCT"CCTCG TTCG rTACCTGAGGAGGAAGCTGGTAGAGTC AGACAGCGA 6370 
CATCAATGCCAACAACiGAAGAGCTGCrrCHGGTGCTCCiACTGGGTACCCAAGCTGTGGTATCATCTCCAC 84< ? 0 

M'.CTTCC'TQ^Q *^OC^CAOr.ACC -C^Ot iC TTCCTCATCOOCOOTTCOTTC P •* t O »" O rfiO'CTOCC'.T TO 0*5 1 O 

GC A T TG AGG AC T TC C GG AC C TG G ^ r C A r r G AC C TG TG G A A C A A C T C T A TC A T P C C C PA I'C TAG AGG A AGG Sf'HO 
A3CCAAGGATGGGATAAAGG rcCA I 3GACAGAAAGCTGC7TGGGAGGACCCAG fGG AATGGGTCCGGG AC 6660 

6660 6670 6680 6690 6700 6710 67?C 
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ACACTTCCCTGGCCATCAGCCCAACAAGACCAA I'C AAAGCTGTACCACCTGCCCCC ACCCACCGTGGCCC 8720 

C I CACAGCATTGCCTCACCTCCCGAHGATAGGACAGTCAAAGACAGCACCCCAAGTTCTCTGGAC TCAGA £730 

TCCTCTGATGGCCA P3C IGC PG A A A C T T C A AG A AC5 C PGCC AACTACATTGAGTCTCCAGA I'CG AGAAACC 6860 

A T C C T GG AC CC C AACC T TC AGO C A AC AC T r PAAGGGTTCGCCAATCAC TG PCACCCCCGGaCAGCACAAC 6930 

GC IGGCATCAGCTATCT rAGCTCCTCC T CTCCCCTCTCCTCTTTCAGAGCAC I GGC PCTCCAGCCCCAGG 7000 

7010 7020 7030 /040 7050 /OtSO 7C70 
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AGCAGAACAGCiAGGGAGGAGGAGATGAAAGAGGAGGGACAGG P rc P IGGTGCTGTACCTTTGAGAAC I PC 70 70 

C PAGGAACGAA PGGTGGGGTGGCGT P PGGGAACTTGTGCCCCC T AAACAC ATTTA:! TOCCC TCC TC T AA f / 1 40 

v-ACT T PGGGGAAAAGATGATTC r G G G TC T T T C C C 7 TG AC I PC i' rG T TTCAA r TAC^ AAC 7CC TGCGCT f r U '0 

CrCGGGAGGGGTTCAGAAAA^A^CAAAACACrGCAGCAGTTCCCCGGAATTCAGC^ TGGAC PI AACCAGG /'23C 

CTCAACTTGCTCAAAAGAAGCCGAATTCCAGCACACTGGCCTCCCCATGGTAT TG^ PA PC i GAGC TCCGC 7350 
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A I'CGGCCCiC fGTCATCAGATCGCC ATCTCGCGCCCGTGCCTC rGACT TC f AAG TCC AA f PAC rCTTCAAC 7420 
ATCCCTACATGCTCT rrC rCCC l*G I GC I'CCCACCCCCTATTTTTGTTATTATCAAAAAAAC TTCTTC f*TA ,'490 

a i n c r r re n r r r i agcttcttttaa\gtc acctctaacaatg aaattgtgtagattc aaa aatagaatt /*>«) 

AATrCGrAArAAAAAGTCGAAAAAAATTGTSCTCCCTCCCCCCATTAATAATAATTCTATCCCAAAATCT 

acacaatgt tctg tgtacac rrc i'i a re itttttttacttctgataaattttttttgaaacatoatagaa 7700 

7710 7720 7730 7/40 7760 / /BO ///O 
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AA AACCGCACAC AAAA FACCTTATC ATATG TT ACG TTTCAG F TTA TGACCGC AAT ! rr TATrrCTTCGCA 7770 
C'JTCTGGGCCTCTCATCiACGTCAAATCATGCTCATCGTGAAAAAGTTTTGGAGTATTTTTGGAA^TTTC 78M3 
AA rCAAGTGAAAGTTTATGAAATTAATTTTCCTGCTTTTGCTTTTTGGGGGTTTCCnCTATTGTTTGTCA /i) !() 
AG AG f T t'CGAGGACGGCG f f TTTC r TGCTAAAATCACAAGTATTGATGAGCACGAT3CAA3 AAAGATCGG 7880 
A A G A AGG T T TGG G T 7 TG AGG C TC AG TG G A A GG TG AG PAG A AG f f'GAl'AA T I* I'GAAA.i I GGA'i t AG G i C I' P.OfC 

«o«o ao/o aoao 8090 3100 a no a 120 
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ATGGGGTTTTTGCC7TAAATGACAGAATACATTCCCAATATACCAAACATAACTGTTTCCTACTAGTr;GG Ml*) 
. CCG rACGGGCCCTTTCGTCTCGCGCGFTTCGGTGATGACGGTGAAAACCTCTGACAnATGC AGCTCCCGG !l 190 

agac(iG rcACAGC rrcrc tg r a ag c gg a ro cc g g g agc ag ac a ag c cc g tc a g ggc 3C g tc ag c gg g 1* g t asgo 

TG G C GGG TG TCG G G G C TG GC T T A AC T A TGC GG C AT C AG AG C AG AT TG f AC f GAG AG TGCACCA TA rcCGG 8330 
T'3TGAAATACCGCACAGATGCG T AAGGAGAAAATACCGCA TCAGGCGGCX r rAACiCiriCCrnCi rGAI'ACSC: flftCO 

84 !0 6420 8430 8440 8450 8460 8470 
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CTATTTTTAfAGGTTAATG T CATGA TAATAA~GGT T TC77AGACGTCAGGTGGCAC P T fTCGGGGAAA TG 8470 
TGCGCGGAACCCCTATTTG T TTATTTTTCTAAATACATTC AAA TATG PATCCGC 'C AIGAaACAA I AACC 0540 
C TC A T A A A TGC T T C A A T A AT AT 7 C A A A A AG G A AG A G I* A PG AG T A T TO A AC A T 7 TC C ^ TG TC Ci C C C 7 T A TT 86 10 
C::CTTTTTTGCGGCATTTTGCCT"CCTC;TTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGA T GCTG IW!30 
AAGATCAGTTGGGTGCACGAGTGGGTTACA^CGAACTGGATCTCAACAGCGCTAAGATCCTTGAGACir TT 8750 

13/60 0770 8780 8790 8800 0G1O U820 
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TC G C CC C G A AG A A C 6 TT f ICCAA fGA l'GAGCA0T7~TAAAG7TCTGCTATG TGGCGLIGG fA I" I'A : CCCG i" 002C 
A I ! GACGCCGGGCAAGAGCAACTCGGTCCCCGCATACAC PAf fC fCAGAA TGAC T T3GTTo AG 7ACTCAC 6890 
GAG f CACAGAAAACCATCTTACGGATGGC A PGACAG f AAGAGAATTATGC AG7GC73CCA7 AACCATGAG 8960 
TGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAAr.CGCTTTTTTGCAC 0030 
A AC A TGG GG G A TC A T GT A AC PCGCC I* fGATCGTTGGGAACCGG AGCTGAA fGAAGCCATACCA^ACGACG 3100 
9110 9120 9130 9140 01GO 9IG0 91/0 
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AGCGTGACACCACGA PGCC IG fAGC AiTGGCAACAACG f I* GC G C A A AC T A T T A A C T G G CG A A C TAC f TAG 9 » 70 
i'C .'AGCTTCCCGCCAACAATTAA rAGACTGGATGGAGGCSGA TAAAG I I G C AG G AC C A C T C T GCG C T CG 92'iO 
GCCCTTCCGGCTCGCTGGT rTATTGC~CATAAATC rGGAGCCGGTGAGCGTGGGTC TCGCGG TA PC A l" I'G 9310 
CAGC AC TGGGGC C AG A TGG PAAGCC C~CCCGTATCGTAG P PATCTACACGACGGGG AG r CAGGCAAC TAT 93rtO 
GG A TG A ACG AAA TAG AC AG A PC 3 C PGASAT AGGTGCCTCAC I GA T TAAGC A"TGGTAACTG rCAGACCAA '3450 

9460 9470 94BO 9^90 9500 9510 Ubi!0 
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G I* r TACTCATATATACTTTACA i* !'G ATTTAAAACTTC A I Y I' T TAATTTAAAAGGATC TAGd 'GAAGA I CC 9o/>'J 
MM I -jA rAATC T CATdACC^AA>* rCCCTTAACGTGAGT" 1 " TTCCTCCACCAGCG PCA3ACCCCG I AG A SQSQ 
AAAG A fCAAAG 3ATC TTCT TGAGA I' C C TT T "TT TTC TGC GCG TA AT C TGC T GC T TGC AA AC A A A A AAACC A S660 
CCGCTACCACCGGTG6 r T TGTTTGC CGGA TCAAGAGCTACCAACTCTTT TTCCGAAGG I AAC I GGC ! }CA 5730 
GCAGAGCGCAGA "AC C A A A T AC TG TCC I" !"CTAG T GTAGCC'3 TA'J ' > AGGCCACCAC f TCAAGAAC C fGT 0800 
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AGCACCGCCTACAI'ACC fCGCTCTGCTAATCCTGrTACCAGTGGCTGCTGCCAG7G3CGA7AAG T CGTG T 3670 

C I I ACCGGG TTGGACTCAAGACGATAG T TACCGGATAAGGCGCAGCGGTCGGG(VrG AACGGGGGG T |*CG I SGU^ 

G'^CACAGCCCAGC rTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCATTGAGAAAG ICO iO 

CGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGrAAGCGGCAGGGTCGSAACAGGAGAGCGC 100M0 

ACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTA TAG I'CCTGTCGGGTTTCGCCACC TCTGACTTG 10l;';0 

10160 10170 10180 10190 10200 10210 10220 
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AGCGTCGAT (TTTG i'GA FGCTCGTCAoGGGGGCGGAGCCTA TGGAAAAACGCCACiC AACGCJGGCC J MM MYA'/M 
ACGG I* ICC I'GGCC TTTTGCTGCCCTTTTGCTCACA re T TC I V rCCTGCGTTATCCCCTGATTCTGTGGAT 10290 
AACCGTATTACCGCC M I C5AG I GAGCfGATACCGCTCGCCGCAGCCGAACGACCGA 3CGCAGCGAGTCAG lUJKC 

tgagcgaggaagcggaagagcscccaatacgcaaaccgcctctccccgcgcgttgg:cgattcattaatg 10^:30 
cagc i'ggcacgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagrtagc i'c 1cs00 

10510 10620 t0530 10540 10550 1C560 106/0 
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ACTCATTAGGCACCCCAGGC n TACA'J i ITArGCTTCCGGCTCGTATGTTGTGTGGAArrG I'GAGCGGA f 100/0 
AACAA T f TCACAC AGGAAACAGC I 1C534 
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ATGACCATGATTACGCCAAGCTTGCArGCCTGCAGGTCGACTc tcujAAATG CAAACCTG rCATTTr TGTG JO 
rATTTCAGCACCGCAGAGAGCACCArAAACAGCTACAACAAGAGCACGAACGrCiiCCTCCA^CAGAAAAA MO 
GC IGCAAAACGC i"GGTACTC ACCACA'GC TCTGATAATTGCCATTTCCTTCGAT'TC I CiACTTTHAA CTG 210 
I G ATATcjGATAACCTAAAATCTGCCATTCAAAATGAAAACTC TTAAAGTT I'AAGAGGC f rTCAA<"GACCC 2U0 
CGCGAC TTTCCCACC ICATCG7TTT TCCCTTCTA PCTTATTcTAA I' ITTTTGTA" TTTGTG rCAGTT I GC 300 
360 li/0 380 3530 HCO 410 '170 
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TATATCC ICATCACC IT T TTTACCGT r r AATCTCTCTATAGTATTGTTCGGG rGGTTTCAAGATOAAACG '12f> 
N I rCTGTG rAGCATTTTTGAAACACAGcGxGAAAAA TGCAGAAAC TATCATCC'iGAATGCA C TAAGTGC '190 
AC TCATTGTCATGAC TCC I'AACCCCCTcGCCCCACCATTrGTCTCTTCACAAAT'CA I GGCACAAT rCAA BHD 
^^^iniT GTGrTGGATCTTSATCTCGTCrCrTTCCTTCT, ' f;TTTCG TTK 030 
TATi CuAATGCCA TGCATTTG fTGATGATGCGCACGATGG rCGCACACACTACAACAGATATGAfGGTGG 700 

'10 720 730 7«0 750 760 770 
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C rGTGCGTGGTGCAGAGCTCATCAATAAGrGATGGGGCACCGAGATG rGACTCC'ICCCATTA I' rCCTCTT 770 
GAGTCGCATCTTT TGTCTTGG CCCG7TGTCAG rCGAGCCTCAAGGCAGCCGTTCCGCCAGTGATCCCCTG 3 '10 
CTAAGAAG : TGTAGGfGTAGAGGAAG^AACG IXGGGTCCAAA ITTCAAGcaaccn ccqcqaqa tog tucio 910 
tcccgGGGATTGGCCAAACGACCCAAACGTATGTTTCGAATGATACTAACATAAC A TAGAACA T I rTCAG 900 
GAGviACCCTTGGc tagaactag fc^gatccgagc cctccca tATGACGACGTC AAA TGTAGAAr rGATACC 1C50 

1060 1070 10a0 1090 1100 ir.c 1120 
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AATCTACACGGATTGGGCCAATCGGCACCTTTCGAAGGGCAGCTrATCAAAGrCGATTAGGGATATTVC.c' I IW 
AATliATI ! iCGCGACTATCGACTGGTT-CTCAGCTTAT^AATGrGATCGTTCCGATCAACGAArrCTCGC I 190 
CJG^ATTCACGAAACGTrrGGCAAAAA-CACATCGAACCTGGATGGCCTCGAAACGTGTCrCGACrACCT 1260 
GAAAAATCTGGCTCTCGAC TCCTCGAAACTCACCAAAACCCA TATCGACAGCGGAAAC ITGGG T GCAG IT 137Q 
CTCVAGC TGCTCTTCCTGCTC fCCACC TACAAGCAGAAGCTTCGGCAACTGAAAAAAGATC AGAACAAAT 1 'iOO 
1410 1*120 M30 1W0 HJSO 1U60 1'I70 
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IGCAGCAACTACCCACATCCAT TATGCCACCCGCGCTTTC fAAATTACCC TCGCC ACCjTG'CGCCACG VC 1 W) 
A ^ AAt ? CG ': r I CAf 'CAAC rAACCCAAATTCCAACTr rcCACAAATGTCAACATCC^GHCTTCAGACTCCA 15';0 
,. AG J GAAtjAATA TC(3AA ^ T ^ATTCATCAAAGATrGGrATCAAGCCAAAGACGl :TGGACr: AAACCAC 1610 
^I3C-^S^ TCAACCACTTCArCAAA " rAATACAAATTCAT TCCGTCCCi ' CG*CCC<5 T TCGAC TGCiCAATAA I68C 
I AATGTTGijCTCGACGATATCCACATCTGCGAAGAGCTTAGAATCATCATCAACG TACAGC fC TATT CCG 1 760 
1760 1770 17flO 17U0 1800 1810 M20 
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^;^ A ^ < ;^ A $ < : TACCTCCCAAC "-"CAAAAACC TTC TaGACCACAAACCCAGC f AG ITCGTG i TGC TA ! 820 
£ AAC .[ AGAAAAATcGGAAGC ^^ 1U G0 
^rrrlrll^T^^ I960 
CTuAACAAC.AAAC fTTGTCGAAAATCGC TGCCCCAG TGAAAAG t'GGCC TG AAG CC s i CC G AC C A G TA A G l" T 2 ICO 

, ?V i0 2120 2130 2M0 21S0 2160 ?170 
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a ;. a I^ gaa - aa ^^I cgaaa cgatgctc:aaagagcagtgaagaagagtccgga-/ ( cg 2?",0 

rr*\r!SS l^**" fl - ATC A rGG ^'- G ^ AAGGTTCCC r AAGC A rCC AT fCC AC A TC TCCAAGAGT ICAAC 2310 

o')r AGAGGAAAAG '';^ ;:yso 

v-'..tiAlAGCCG'v.AACACCGG JTTCTCCAAATAT fATCAACAAGCC ' G T T C. A G G A A A ; . AC C A A C A C T G G C A G 2n<X> 
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TGAAA66AGTGAAAAGCACACCGAAAAAAGATCCACCrCCAGCT6TTCCGCCAC6T6ACACCCA6CCAAC 2520 
AATCGGAGTTGTTAGTCCAATrArSGCACATAAGAAGTTGACAAATGACCCCGTGATATCTGAAAAACCA 
GAACCTGAAAAGCTCCAATCAATGAGCATCGACACGACGGACGTTCCACCttC r rc:ACC rc rAAAATCAG ?660 
M G r rCCAC r rAAAA I'GAC r rCAATCCGACAACCACCAACGTACG ATGTTCTTCTAAAACAAGGAAAAAT /30 
CACAI'CGCC rGI"CAAGTCGTTTGGATATGAGCAGTCGTCCCCGTCTGAAGACTCC.\TTGTGGCTCATGCG 2000 
2810 2820 2830 2840 2860 2860 28/0 
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TCGGCTCAGGTGACTCCCCCGACAAAAACTTCTGGTAATGATTCGCrGGAGAGAAc^GArGGGAAAGAATA ?B70 
AGACATCAGAATCCAGCGGCTACACCTCTGACGCCGGTGr TGCGA TG TGCGCC AAAATGAGGGAGAAGC T ?M0 
CiAAAfiAATACGATGACATfiACrCG TCG AGG ACAGA ACGGC TATCC TGAGAACTTCt; AAGAGACVrTCf. VGC .10 10 
TTGrCGTCrGGAArArCCGATAACAACGAGCTCGACGACATATCCACGGACGATTTGTCCGGACTAGACA .*X>80 
I'GGCAACAGTCGCCTCCAAACA TAGCGAC f AT fCCCACTTTGTTCGCCATCCC ACG TCT TC T I'CC fCAAA 3150 
3160 31 A) 3 1 80 3 1 SO 3?00 3? 1 0 3"^ ?0 
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GCCCXGAGTCCCCAGTCGGTCCTCCACATCAGTCGATTCTCGATCTCGAGCAGAACAGGAGAATGTGTAC IMW 
AAAC r i*C TG rCXCAGTGCCG A A C f 3 A R C. C A A C f i fGGCGOCGCTGCC ACC ICAACC I* I'CGCiACAACA r rCGC 32G0 
FA AG A rCCCCGGGA IAC TCA ICC fA r l*C rCCACAC I' I'A rC AGTGTCAGC TGA ! AAGGACACAA \ G I'C I'A I 3360 
GG ACTCACAGACTAGTCGACGACCTTCTTCAGAAAAACCAAGCTATTCAGGCCAATTTCATTCACTTG A f 3430 
CGTAAA7GCCACCT rCAAGAGTTCACA" r CCAGCGAGCACAGAATGGCGGCTCrCT"GAGCCCGAGACGGG 3500 
3G10 3(320 3G30 3G40 30G0 3660 J'J/O 
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TO CCG AAOTCGATG I"CG A AATATGA T TC TTCAGGATCCTACTCGGCGCGTTCCCGaGG I'GGAAGC fC I AC 3570 
I'GG T ATCTATGGAGAGACGTTCCAACTCCACAGACTATCCGATGAAAAATCCCCCt.CACATTCTGCGAAA 3*40 
AGTGAGATGGGATCCCAACTATCAC7GGCTAGCACGACAGCATATGGATCTCTCAATGAGAAGTACGAAC 37 10 
ATGCTATTCGGGACATGGCACGTGACTTGGAGTGTTACAAGAACACTGTCGACTC/\CTAACCAAGAAAr.A 3780 
GGA6AACTATGGA6CA r IT, I" I* l*GA YC f !* I' I* fGASC AAAAGCT TAG AAA AC fCAC TC AACACA TTGATCGA 38*>C 

3860 3870 3880 3890 3900 3910 3<J20 
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rCCAACTTGAAGCCTGAAGAGGCAATACGATTCAGGCACGACATTGCTCATTTGAGGGATATTAGCAAIC 3970 
AT CT rGCA7CCAAC T CACCTCATGCTAACGAAGGCGCTGGTGAGCTTCTTCG" r CAACCATCTC 7GGAA PC 3GG0 
AGTTGCA fCCCA I'CGAI C A CCGA f'G i'C A ICG I'CG fCGAAAAGCAGCAAGCAGGAGAAGATCAGCTTGAGC 4300 
rCGTTTGCCAAGAACAAGAAGAGCTGG ATC^GCTCCTCACTCTCCAAGTTCACCAaGAAGAAGaAC AAGA UI30 
AC T ACG A C G A AG C AC AT A TG C C A f C A A T T TCC GG A TC T C A AG G AA C TC r TGACAAC A r TGATG^GATTGA ^200 

4210 4/U0 4230 4?40 4250 4260 4270 
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G T TG A A G C A AG A G C TC £ A AG A ACG C G A l*AG TGCAC rrTACGAAGTCCGCCTTGAC^ATCTGGATCGTGCC 42/0 

C(vCCiAAGTTGATGTTCTGAGGGAGACAGTG AACAAGTTGAAAACCGAGAACAAGCm AT rAAAGAAAG AAG '1340 

T GGACAAACTCACCAACGGTCC AGCCAC rCGTGCT TC T rCXCSCGCCTC AATTCCmG TTATC TACG ACGA 4410 

IGAGCA rG T CTATGATGCAGCGTGTAGCAGTACATCAGCTAGTCAATCTTCGAAAC:GATCC I'C I'GGC ! GC 'I'lflO 

AAC rCAATCAAGGTTACTGTAAACG TGG/CATCGC TGGAG AAATC AGTTCGA TCG" TAACCCG^AC AAAG 4b* : 0 

4G80 46/0 4680 4590 4600 4610 4620 
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AG A T A A T C G TAG G A I'A I C . I'G'.XA »"G K AACCAGTCAGTCATGCTGGAAAGACAT"GATGTT TC I'A I" I'C r 46?0 

AGGACTATTTGAAGrCTACC I'A rCCAGAA" TGATG rSGAGCATCAACTTCGAATCCl ATC.CTCC, CGA I I'C i" 4690 

ATCCTTGGC I'A I'C AAA rTGGTGAACTTCGACGCGTCATTGGAGACTCCACAACCA'GA TAAC'^AGCCA PC 4760 

CAACTGACATTC r TAG i I'CC I'CAACTACAA TCCGAATGTTCATGCACGGTGCCGC aCAGAG TCGr:.i TACA -i8X 

C A G T C T G G TO C T f G A "A T G C T TC TT C C A A A GC A A A TG ATTCTCCAACTCGTCAACrCAAIl r I'C-AC AGAG 4?30C 
AGACGTCTGGTGTTAGCTGG AGCAACTGGAA I* TG G A A A G A GC A A A C I'GGCGAAGAC CC TGGC TGCTTATG '1970
IA I'C TA f TCGAACAAA I'CAATCCGAAGATAGTATTGTTAATATCAGCATTCCTGA^ AACAATAAAGAAGA 60 '10 
ATTGCTTCAAGTGGAACGACGCCTGGAAAA6ATCTTGAGAAGCAAAGAA TCA fGCA TCGTAATTCTAGAT 51 10 
AATATCCCAAAGAATCGAATTGCATTTGTTG TATCCG f T f ITGCAAATG I'CCCAC I PC A AA AC A AC G A AG 5i30 
G ICC! A TT TG TAG rA rGCACAG l'C A ACC GAT ATC A A ATC CCTGAGC TTCAAATTCAC CACAA I* f I'CAAAA (' 5250 

5260 5270 5280 5290 5300 5310 5320
GTCAGTAATGTCGAATCGTCTCGAAGGA7TCA7CC7ACG 7 7ACCTCCGACGACGGG CGG7AGAG0A7GAG ii'J»;
TATCGTCTAACTGTACAGATGCCATCAGAGCTCTTCAAAATCATTGACTTCTTCCCAATAGCTCTTCAGG 5390 
CflGTCAATAA I' J" I' I A f IGAGAAAACGAATTCTGTTGATGTGACAG 7TGGTCCAAGA GCA I'GC NGAAC I'G MSC 
TCCTCTAAC TGTCGATGGATCCCGTGAA" r GGTTCATTCGATTGTGGAATGAGAACT TCATTCC A f A l I Ifi 56X) 
GAACGTGTTGCTAGAGATGGCAAAA'NAACCTTCGGTCGCTGCACTTCCTTCGAGGA TCCCACCGACATCG 5600 

5610 5620 5630 5640 5650 5660 5670

TC TC TAAAA AATGGCCGTGG TTCGATGGTG AAAACCCGGAGAATG TGCTC AAACGT CTTCAAC TCCAASA 06/0 
CC 1 CG I'CCCG PCACCTGCCAACTCATCCCGAGAACAC FTCAA rCCCCTCG AGTCGT TGATCC A ATTGCAT (wUf) 
GC TACCAAGC A TC AG ACC A I'CG AC A AC A T I' I'GAAC AGAAGAC rC rAArClTC ICTCGCX rCTCCCCCGCT 5610 
TTCCTTATCTTCGTACCGGTACCATCGTATTCATATCTGAGCTCCGCATCGGCCGCTGTCATCAGATCGC 6600 
CA TC rCGCGCCCGTGCCTCTGACTTCTAAGTCCAATTACTCTTCAACATCCCTACA TGCTC TTTCTCCCT HOfiO 

5960 5970 5980 5990 6000 6010 6020

GTGCTCCCACCCCC l'ATTTTTGTTATTATCAAAAAAACTTCTTCTTAATTTCTTTG TTTTTTAGCTTCTT ttOXO 
TTAAGTCACCTCTAACAATG AAATTGTGTAGATTC AAAAATAG AATTAATTCGTAA TAAAA AG TCGAAAA 6090 
AAAT rGTGCTCCCTCCCCCCAr TAA TAA I'AA I* !"C f A1CCCAAAATCTACACAATG7 TC7G7G7ACAC7TC 6160 

r ; a ig r r r r i r r \'&crrc tgataaattttttttgaaacatcatagaaaaaaccgcacacaaaa tacc r i'a 6230 

TCATATGTTACGTTTCAGTT^ATOACCGCAATTTTTATTTC r TCGCACG VC I'GGGCC l'C TCA I GACG I CA 6300 
6310 6320 6330 6340 6350 6360 6370
AA TC ATGCTCATCG7G AAA A AG TT f I'GGAG fATTT T7GGA ATTTT TC AATC AA6T(i A A AG I II A I'GAAA r f>3 /<)
TAATTTTCCTGCTTTTCCTTTTTGGGGGTTTCCCCTAT TGTTTGTCAAGAG TfTCG AGGACGGCGTTTTT 64UC 
C T T G C T A A A ATC AC A AG T A T IG A TGACCACGATGC AAG AAAGATCGGAAGAAGGTT TGGG77TGAGGC FC 6510 
ASTGGAAGGTGAG FAGAAGTTGA7AATTTGAAAG rGGAGTAGTGTCTATGGGGTT'F TTGCC I" I'AAATGAC 6580 
AG AATACATTCCC AATATACCAAAC A7AAC7GTTTCCTAC TAG J'CGGCCGTACGGC CCCTTTCGTCTCGC 6650 

6660 6670 6680 6690 6700 6710 6720
GCGrrrCGG IGATGACGGTG AAAAC CTCTG AC AC A TGC AG CTCCCGGAGACGG TC/ CAGC itg I'C I'GIAA QI'X)
(Ml SG A TGC C GGG AGC AG AC A AG CC C G TC AC GGCG C G TC AG CGGGT GTTGG C GG G T<3 I'CGGGGC 7CGCT7A 6/90 
AC TATGCGGCATCAGAGCAG ATTGTAC TGAGAGTGCACCATATGCGG TGTGAAATa* CCGCACAGA tgcg r 606o 
AAGGAGAAAATACCGGArCAGGCGGCCTTAAGGGCCTCGTGATACGCCTArrTTVTAGGTTAATGTCAr 60'JO 
GATAATAATGGTTTC rTAGACGTCACGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCC PAT ITGTTTA 7000 

7010 7020 7030 7040 7050 7060 7070
TTT7TCTAAATACATTCAAATATG r A FCCGCTC A TGAGACAAT AACCCTGATAAA I CC N CAA I'AA FA 77 70/0
OA A A AAG G A AG AG T A TG AG T A I" I'C A AC A T T TC CG I'G rCGCCCTTATTCCCTtT F f I GCCiGCA I I I I'GCt: f 7 r .0 
TCCTG7 r Tt TGCTC ACCC AG AAACCCTGG r G A A AG T A A A AGA T GC TG A AG ATC AG I I'GGG TGCACGAGTu /210 
GG I f AC A TC G A AC tC G A TC 7C A AC AG C GG T A AG A T C C I'TGAGAGT TT rCGCCCCGAAGAACGTTrTCCAA T?M 
TGATGAGCAC r !' I" fAAAGTTCTGC i'A I GTGGCGCGGTATTATCCCG TA T TGACGCC 5GGCAAGAGCAAC I /JOO 
CCSG I'CGCCGCA [AC AC V AfTCTCAGAATGACTTGG TTGAGTACTC ACCAGTCACAC AAAAGCATCTTACG /420
GATGGCATG ACAGTAAGAGAA FT A FGCAG IGCTGCCATAACCATG AGTGATAACAC TGCGGCCAAC 1" I'AC 7<t30 
TTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGCGArCATGTAACTCG /680 
CCTTGATCGTTGGGAACCGG AGCTGAATGAAGCCATACCAAACGACGAGCGrGAC^CCACG A PGCC TGTA 7630 
GCAA rGGCAACAACGTTGCGCAAAC TATTAACTGGCGAACTACTT ACTCTAGC TTC CCGGC AACAATTAA 7 700 

7710 7720 7730 7740 7750 7760 7770
TAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCCCCTCCCTGGTTTAT ///O
TGCTGATAAATCTGGAGCCGGTGAG CG TGGGTCTCGCGGTATCATTGCAGCACTGC GGCCAGATGGTAAG /H4<> 
CCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACC A A ATA G AC AG A TCG 70 10 
C rGAGATAGGTGCCTCACTG ATTAAGCATTGGTAAC FG TCAGACC AAG F F FACTCa* FA FA F AC f r TAG A T 7980 
TGATTTAAAACTTCATTTTTAATTTAA AAGGATCTAGGTGAAGATCCTTTTTGAT/ ATCTC ATC-AC.CAAA BOW 

8060 8070 8080 8090 8100 8110 8120

A I CCC I I'AACG I'GAG IT ITCC5I* I'CCAC I'GAGCGTC AGACCCCGTAGAAAAGATCA^ AGGA I'C I i f J ! I GAG 8120 
ATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACC/>GCGGTGGTTTGTTT 8 ISO 
GCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGC AGATACCAAATACT 
GTCCTTCTAGTGTAGCCGTAGTTAGGCCACC ACTTCAAGAACTCTGTAGC ACCGCCTACATACC TCCCTC BlttP 
TGCTAATCCTGTTACCAGTGGCTGCTGCCAG TGGCGATAAG I CG FGTC T FACCGGG T^GGAC TCAAGACG 8UCC 

8410 8420 8430 8440 8450 8460 8470
A ;*AGTfACCGGATAAGGCGC AGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGA 8M70

ACGACCTAC ACCGAACTGAG ATACC FACAGCGTGAGCATTGAGAAAGCGCCACGC ; TCCCGAAGGGAGAA 85'I0 

ACGCGGACAGGTATCCGG TAACiCGGCAGGGTCGGAACAGG AGAGCGCACG AGGGA<i C T !*CC AGGGGGAAA 8610 

CGCCTGGTATCTTTA TAG I CC I'GTCGGGTTTCGCC ACCTCTGAC T FGAGCS I CGA f f FTTGTGATCCTCC 8680 

TCAGGGGGGCGGAGCCTATGGAAAAACGCCAGr.AACGCGGCCTTTTTACGGTTCCTGGCCTTTTCCTGGC 87H0 

8760 8770 8780 8790 8800 8810 8820
C f rf TGCTCACATGTTC TTTCCTGCGTTATCCCCTGATTC TGTaGATAACCGTAT"! ACCGCCTT TGAG i'G OfiX)
AGCfGATACCGC rCGCCGCAGCCGAACGACCGAGCGCAGCGAGTC AGTGAGCGAGC A A G C G G A A3 A G C G C 3690 
CCAATACGCAAACCGCCTCTCCCCuCGCGTTGGCCGA F i'CAl I'AA'FGCAGCTGGC/ CSACAGG TTTCCCG 80«0 
AC rGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTACGCACCCCAoGC I I ! S030 
AC AC TT f A IGCTTCCGGCTCGTA FG TTGTGTGGAAT FG IG AGCGGATAACAaMTTC AC ACACGAAACACIC Q10C 
9110 9120 9130 9140 9150 9160 9170
AAriCTTGCATCCCTCCAC6TCGACTCTAGAGGArCCCCGGGArrGGCCAAAGGACCCAAAG9t<ikqtt tc 70
gaat gatac taaoatuuuacagaacat 1 1 tcaaGAGGACCCTTGGAGGGTACCGG'I AGAAAAAATGAGTA 1UO 
AAGGAGAAGAAC T ITfCAC PGG AG rTGTCCCAATTCTTGTTGAATTAGATGCTGA f G I' PAA PGGGCACAA 2KJ 
ATTTTCTGTCAG TGGAGAGGG PGAAGG I GA PGCAACA TACGGAAAACTTACCCTT/i AATTTA I' I' I'GCACT 280 
AC rCGAAAACTACCTGTTCCATGGgtci^tttnnnccjtototcjtcrttartc fcaaccc tgottcit ttooo t *: 3G0 

360 370 380 390 400 410 420
tt oagCCAACAC r rGTCACTACTTTCTgTTATGGTGTTCAATGCTTcTCgAGA fAC CCAGATCATATCAA 420
ACcjGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTA TG VACAGGAAAGAACTA TATTTTTCAAAG A P 4*30 
GACGGGAACTACAAGACAC^taagt ttaaacagttcggtactaac taaccutacafcat t taaatt t tcaq 660 
GT GCTGAAG TCAAGTTTGAAGG TGA S'ACCC PG I' TAA [ AGAAPCGAGTTAAAAGGT ATTGATTTTAAAGA 630 
AGAfGGAAACA P TCTTGGACACAAATTGGAATACAAC7ATAACTC ACACAATG PA f ACA PCA PCCCAGAC /OO 

710 720 730 740 750 760 770
AAACAAAAG AATGGAATCAAAGTTg taag 1 1: taaac t tggact tac taat: tauc;g<;ctfc tutct ttoaatz 770
ttcagAAC r PC A AAA r r AG AC AC A AC A PTGAAGA TGGAAGCG T PC AAC TAGCAGACOATTATC AAC AA AA 8UO 
TACTCCAA P PGGCGAIGGCCCI G1CC PT P PACCAG ACAACCATTACCTG FCCACAC AA PC 1 GCCC P I PCG 910 
AAAGATCCCAACGAAAAGAGAGACCACATGGTCCTTCTTGAGTTTGTAACAGCTCCTGGGATTACACATG 0U0 
GC ATGGATCAACTATAC AAATAGCA^TCGTAGAATTCCAACTGAGCGCCGGTCGCT ACCATTACCAACTT 10**0 

1060 1070 1080 1090 1100 1110 1120 
GIC PGG PGl'CAAAAATAATAGGGGCCGCTGTCATCAGAgtaagtt taaac fcgagt to t«r:fc««<: kaor.ga 1 120
ulmalal tfcaaat 1 1 tcagCA PCTCGCGCCCGTGCCTCTGACT TC PAAGTCCAATT ACTCTTCAACATCC t 190 
CTACATGCTCTTTC fCCCTGTGCTCCCACCCCCTATTTTTGTTATTATCAAAAAAACTTCTTCTTAATTT 1200 
CI P l'G T T TT TTAGCTTC TTTTA AG f CACCTCT AAC AATGAAA P TG TGTAG ATTCAA AAA TAGAA P TAATT 1330 
CGTAA PAAAAAGTCGAAAAAAATTG PSCTCCTCCCCCCA ,'TAATAATAATTCTA'f CCCAAAATCTAC AC 1400 
1410 1420 1430 1440 1450 1460 1470

rGTTCTG PG I ACAC^TCTTATGTTTTTT PPACTTCTGATAAAPTTTTTTTGAAaJCA PCA7AGAAAAAA 14/0 

• - - — ----- - - ***** * -j ^no 




TAAAVC-CT TCAATA f ATTGAAAAA6GAAGAGTATGAC fATTCAACA I I i CCG rG JCGCCC \ ^ JO.C jjJW 
^TTTTGCGGCATT f I'CCv I' l"CCTi I'TTTGCTCACCCAGAAACGCTGG TGAAAG' \AAAuA iGCTviAAGA > 
TOAGTTGGC I'GCACGAGTGGG FTACATCGAAC FGGA FCTCAACAGCGGTAAGA ICC TrCAGAGTTTTCGC 2620
CCCGAAGAACGT rrrCCAATGATGAGCACTTTTAAAGTTCTGCTATGrGGCGCGu 'ArrATCCCGTATTG 2600 
AC.'GCCGGGC AAGAGCAAC I'CGGTCGCCGCATACAC fATTCTCAGAATGAC r rGOT'GAGTACTCACC AGT 2660 
CACAGAAAAGCA rCTTACGGATGGCA I'GAC AGTAAGAGAA I" I'A IGC AGTGCTGCO.TAACC ATGAG TGAI' 2730 
AACACrGCGGCCAACTTACr I'C fGACAACCATCGGAGGACCGAAGGAGCTAACCG; T f f I* I'TGCACAaCA 2800 
2810 2820 2830 2840 2850 2860 2870
i GGGGGA7C ATGTAACTCGCC I' TGA7CGTTGGGAACCGGAGCTGAATGAAGCCA f ACCAAACGACGAGCG 2870
rGACACCACGATGCCTG7AGCAATGGCAACAACG7 T GCGC AAACTATTAAC TGGCt» AACTAC T I'AC TC FA 2940 
GCTTCCCGGCAACAA I fAATAGACTGGATGGAGGGGGATAAAGTTGCAGGACCAC** TC TGCGCTCGGCCC 30 !0 
r I CCGGC TGGCTGGTTTATTGC HiA I AAATCTGGAGCCGGTGAGCG TGGG TCTCGCGGTATCAT TGCAGC 3080 
AC TGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAG It AGG C A AC TAT G GAT 3 ISO 

3160 3170 3180 3190 3200 3210 3220
GAACGAAA I AGACAGATCGCTGAGATAGG fGCC TCACTGATTAAGCATTGG rAAC'GTCAGACCAAGTTT 3220 
AC VCATATATACTTTAGATTGA I" P i"AAAACTTCATTTTTAAfTTAAAAGG ATCTACiGfGAAGA I'CC NTT 3?:"30 
TGATAATCTCATGACCAAAA rCCCTTAACGTGAGTTTTCG nCCACTGAGCGTCAGACCCCG 1'AGAAAAG 3360 
ATCAAAGGATCTTC T TGAGATCCTTTTTT^C fGCGCGTAATCTGCTGC r I'GCAAAt AAAAAAACCACCGC 3'130 
TACCAGCGGTGfri ITS I* TTGCCGGATC AAGAGC MCCAACTCTTTTTCCGAAGGT/\ACTGGCTTCAGC AG 3500 
3510 3520 3530 3540 3550 3560 3570
AGCGCAGATACC AAATA'J I G IT.CTTCT AGTGTAGCCGTAGTTAGGCCACC AC ITCaAGAACTCTGTAGCA 30/0
CCGCCTACATACCTCGC IX I "GCTAATCCTGTTACCAGTGGCTGCTGCC AG TGGCG#vTAAGTCC.TGTCTTA 36'I0 
CCGGGTTGGACTCAAGACGA^ACiTTACCGGATAAGGCGCAGCoGTCGGGCTGAACCiGGGGG f \ I'CGTGCAC 3710 
ACAGCCCAGCTTGGACCGAACGACC^ACACCGAACTGAGATACCTACAGCGTGAGCATTGAGAAAGCGCC 3700 
ACGC r rCCCSAAGGGAGAAAGGCGGACAGGTATCC GGTAAGCGGC AGGGTCGGAACAGGAGAGCGCACGA 3050 

3860 3870 3880 3890 3900 3910 3920
GGGAGC I rCCAGGGGGAAACGCCTGGTATC CTTATAGTCCTti fCGGGTTTCGCCAKCTC i'GAC ITGAGCG 3920
rCGAI I'TTTGTGATGCrCG I CAGGGGGGCGGAGCC TAT5GAAAAACGCCAGCAACGCGGCCT7T7TACG6 # J»W 
TTCCTGGCC rn s'GCTGGCCTTTTGCrCAC ATGTTC I* ITCCTGCGTTA TCCCC TGATTCTGTGGATAACC ftOGO 
GTATTACCCCC I* TTGAGTGAGC TGA f ACCGCTCGCCGC AGCCGAACGACCGAGCGCAGCGAGTCAG7G AG 4 130 
CGAGGAAGCGGAAGAGCCiCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCG^ ITCATTAATGCAGC A2C0 

4210 4220 4230 4240 4250 4260 4270
TOGC ACG AC AGG T T7CCCG AC IGGAAOGCGflGCAG TGAGCGCAACGCAATTAATGTGAGT fAGCICAC It 'liY/O
A f r AGGC AC C C C AGG C 7 rTACACTTTATGC T rCCGGCTCGTATfi r I G r GT6GAAT rGTGAGCGGA FAACA 4340 
A^TTCACACAGGAAACAGCTATGACC A rGATrACGCCAAGCTq toaqt 1 1 aaaca t qa I Lacboac to 
ac to v tc teat i taauL 1 1 1 cagAGCTTAAAAA I GGCTGAAATCACTC ACAACGAFGGA TACGC TAACAA 
C TTGGAAA I'GAAA I" <M9<1 
CTAAATTGTAAGCGTTAArATTTTGTTAAAATTCriCGTTAAArTTTTCTTAAATCAnCTCATTTTTTAAC 70
CAA I AGGCCGAAA fCGGCAAAATCCC7TATAAA7C AAAAGAA7AGACCGAGA7AGG 3TTGAGTGTTGTTC 140 
CAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACG ir.AAAGGGCGAAAAACCG TC PATTA P 10 
GG G C GATGG CCC AC T AC G 1*6 A ACT A I'CACCC r AATCAAG777TTTGGGGTCGAG6T SCCGTAAAGCAC fA 7X0 
AATCGGAACCCTAAAGGGAGCCCCCGAT77AGAGC77GACGGGGAAAGCCGGCGAA '"GTGGCGAGAAAOG 3G0 

360 370 380 390 400 410 420

AAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTG TAGCGG TCACG : I'GCGCG I'AACCAC M20 
CACACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTCCCA r rCGCCA T IT.AGGC I GCGCAAC IG I i'GG 4S0 
GAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGC fGGCGAAAGGGGGATG TGCTGCAACJGCGAT S60 
7AAG7TGGG7AACGCCAGGG7777CCCAGTCACGACGT7G7AAAACCJACGGCCAG 7 jAGCGCGCu I AA I A 620 
CGACTCACTATAGGGCGAAT TGGAGC fCCACCGCGG7GGCGGCCGC7C 7AGAAC TAG rGGATCCCCCGGG 700 

710 720 730 740 750 760 770
CTGC AGG AATTCGA fA FCAAGC 77ATCGATACCGTCGACCTCGAGGATCAGAAGAA AT TGGAGCAACTAC / /O
CC AC ATCCATTATGCCACCCGCGG I' I" f C TAAG TGAGTTTA A r r r rGAG J TTACGAC TACAA AAATGTG r i" 3^0 
CT7TAATAACTATC I I CGACTTGAGTCTATTC TGTATCAC7AG77G ITGAGTGATTTTTCATTGAGAAAA 010 
TATTAAAAGGAACATTATTTACTTrGCrTATTTGCCCTAACrrTGATTTAGTTTTTCGArCAACTAGATC 9flC 
rTACAAAACTTGCAATACAATTCCAlTTTCAGATTACCCTCGCCACGTGTCGCCACoTCAGCAACCGCTT 1060 

1060 1070 1080 1090 1100 1110 1120
CAGCAACTAACCCAAAT TCCAACTTTCCACAAATGTCAACA7CCAGGCT7CA6AC7CCACAGTCAAGAA f ! !?0
A I CSAAAa I rGGTAAGAATTTTA 1T77GAGCTCAAAC7TG7A7AAAA7GCCCA3AAAAGAAGA TGA I AAA 1 ISO 
AATGTAGTTT r f f TGCAAAACTTCC ACCT77A77GC rCTAATATGACGGCTTATATCTCAA r T TTC TTGA 1260 
G7 TTTATC AAAAAA T TT7CC AC TA I* AC AAA 7G TAG AAA AG T A7 77 7GC AC AAA T7 7 7G TCAG7TGACAGC 1330 
TT7GTAATAGA7CCAAA7GGAACCTAGA7ACAAGC7G I' I* A AAG 70 G AAG G AG C GC A AG 7C 7 A 7 AC T G (1 A A 1 MOO 
1410 1420 1430 1440 1450 1460 1470

ATAA I GATC7GAAACAAAT7TGTGC7A77C IXAAATG T77AAGACA7GT777GAAG A7777 f fCAAATTC 1 '170 
GC AC7AG F77CAGAACC77CC7 f r TTG"TATGAAAAAG FAAAAAAAAAC7A77TCAA ACCC I'CACCGCCAC 1640 
CATG77 rcAAC7CTTAA77TTTA7AAAA77T7GCAA777ACAAA7CGCC rCCCC7*f GCCCGAAAAG fGCC 16:0 
CACC AAAA rCAATTTC TCGGC X TCA TAATGAC 7777AAA f TGATG TGAGAAAACAC AG AAG AG GC I'AAC \ 1 030 
AAAITGACAGGGACAGG7TG7rf:C7C77CrCCC7CC7TC7CCCGCC7CC7CC7CCCCI ICCA7C7CCAAC I750 

1760 1770 1780 1790 1800 1810 1820
AAC'AAC AA777TCCAA I f 7CGT fGTCC A TT77GC T7A7AAACA I I" fUTGTGTGGAAGGAAAC I'ACACGUG 1320
GAGACGGTCAATTAA rrCGAAT3AGAGCA7IjGCAAT7ACTCT7TCGGAAATTGATC.AA i'AAAGATAGAGC 1U**0 
CGA7GACAC7GGC7GGI AG7AG7A7GAG7GTAGAA STGCT T ITTCATCGTCTC ^;IISCnCATCAGTCJ \MQ 

rc 

GG 



rCCCCCGCTC! : CA7CAC7GACAAT7AATGTCGGG7777ATGCGC7CJ77CCrAT7C 20UJ) 
GGTrACCACAAACTGGAATACATTTTACTACTATTCAAGCCATTrATTTTGATA. 7AA i 777^ TGv-AA 2100 

2110 2120 2130 2140 2150 2160 2170
TAGGGAT AAACACGACT I TT AAAAG TTTA~ TTAAAAAAACGA \ A r FTTCG A [.H : AAAAAA 1 1 C rfiAAAAO * I A)
AC ATT'CAGCCAG I' rCGTACAATTTTCCATGCTTTTTGGCCCATTAAAAAAC I' TTCTCACCTC T7CA IT (A 7G20
TC TCACTCG TA TC A I'AAAAAG I A I'AGCAAAAGCCCGAC TC TAC TTTTTAAGAG AAGGAGA T AC TGAGcVa 2500 
OA IGGCG TG TGACCCTTTTCATCTCGTCXCTTCGGTCTCAAATTCACGC rCATACTAACTCTTCAAA f AG 2650 
CC ATAGACCTCCTTGTTTTCTTCTTCGTTTTGACTCGCGCCTATTTTTTG rGGCTGCCTGAAAGCCGGGA 2700 
AAA ITTAGTATATTTATGAGCTTATCTTTATGCAATACATAAAAAACGAGGCAATTTAAAAATA I IAAAA 2800 

2810 2820 2830 2840 2850 2860 2870
' ^JGTAGATGTAGATTT'GGAAAAGAAGAAAAAAAC AAAACAAATAGGAACCGCCAGATCAAA 2070
A U$I A -Ix AAA ^If r,rCAACA ^^ 2940 
ATCTAG IGTAACG fTTAGATTGAAC TCGGAAA TCCTAAGCC I" 6 A A C T A T A G C C TT A T T C T A r, A I ' f r TAGT 30 10 
Ir^$ A T^ A 5 C I CAAW:CC:AAGC;AGAAATGACTTGCATTTAGTTTAAG CC IAGATTGACTTGCTTGC I I C 0080 
AGrCrAATCCAGACTAGATTTCCAAGAGAGTTTTCAATTTTAAATGTTTCCAGrrrcrrGTTACTTAAAA 0150 

3160 3170 3180 3190 3200 3210 3220

J G J GA J G ^ G ^ AAA ATCGTTATCCC rrrCTCTCACACTTCAATTACAGATTCATCAAAG 0220 
A JIJ5™JC#3CCAAAGACGTCTC^^ 3290 
rII*»ITS?STS5r5i rCGAGTGGCAATAATAATGTTGGC TCGACGATATCCACATCTGCGAA 30f>n 

AGCTC lATTrCGAATC i AAACCGACCTACC'CCCAACTCCAAAAACCTTCTAGACC ACAAACCCACC TAG 0600 
3510 3520 3530 3540 3550 3560 3570

!l GG I G n!fSI AGAACTACAAAAATnGfiAA fi(- rCAAAGCTAGCCGCTCCG AAAGCC 3 CGAGCACCCCAAA lM>/0 
X; '.I G 9IZ C ' ? ! GAAGACrATT GGAGCAAAAGAAGAGGGGGATAACAGCGGTGGTG3rGGTGGTGf>AA IG 360T5 
"l^ AAA II A ^ GTTArTCAGTAKCAAAAA ^Cf:ATCTTCCTCATCGAATAGCCCACAACCrACGAGAAAGG 0/10 
^^ GG, ; GGTljCCTCAACAACAAACTTTGrGGAAAA ^GCTGCCCCAGTGAAAAGT",GCCTGAAGCCGCC 3700 
GACCAu TAAGCTCGGAAGTCCC ACGTCTATG TCGAAGCTTTGTACGGTGAGTATT1 TiAAA TCGfiAAA rTG OfiW 
3860 3870 3880 3890 3900 3910 3920

GAAA rGTA T TTTTTAAAAAC TGAAAATTCTAC AAAATAAA I' I'AAAAA r AAGATTTT TTCCTCTGATAG ( A -'020 
IIa$ A I CCCA S T ^II TTAC r rrGAA G ATTT ATATCTTGGTTCA7ATTGAAGATATC AG A I' A IAGAAAAAG 3S90 
AAATAAAAAATATTTTGACAGT TGA7AAT •" ( X PG TA I'AGGACCAAAGACAAGTGAG AT ATAAGCTGTC AA '1060 
ACiTTGATTTTCAAGAAATTTTAAAACCCTAGTTTTGCGAAGCTCTGGGCCTCATCT ^ATTTAGAACCCG 4 100 
ATTCGTAC i I C f rcCGTTCC ITGAC TC TACCAAAACCAAAACCAACCTAC TAATAA XAATGATGAGACAA 42C0 

4210 4220 4230 4240 4250 4260 4270
TTGGGAAT fG IT ICCCA I I I I C I IT I I' IC I CC r I GACAC CC T X rCAGAATCTATG 1 " ZCCATf TTTTTCTC «2 70 
U ! IG 'CrcCCCCCATAAAGACTCTTCGCGGAAAAATGTTCCAACGGAAGTGATATT 'CGAGCA I IT I rCG 4TI0 
ACGTCuAGGGCCGAAAAACACA TC TGGC TC, AC AAAGAG TAAAGCAA X X XC fCAGCT ! TTC TTCGCCGGTT 4U HJ 
T i TCAATTCGTTT7TCAAAATGAGCTACTA.AGAGTGAAAGAGCACAAATT6CAAA \CATT I T IG I G ' «A W'.QQ 
it A I GCAC TTTT7GAAAAA7TAACTTTACGTTT TCAG T TTC TARTATTTATTTTTTT :ATAT AAATTAGAC UfifiC^ 

4560 4570 4580 4590 4600 4610 4620

C ( I C •• IAGACC I GC I A I A r T IT i IAAA AC I - ICC TAC TG AAA TATACGAGA I IC I f I'GAC C • :'CCGGAA X 4620 
IG I C I' TA FGGCT PCTA X TAf T XA CGASAAA ACATTTTTTAAAAATTTT 7T TGAAAA XAAAAC '"G fKCA TC 4830 
TTCG7TTTTTACATAGTAATTTCCAGCCAAAAGTTTCCTACCGTAAAACGGACGCC ICAATCATATCTA U /riO 
ACAAGAC TCGAAACGATGC TCAAAGAGCAG "GAAGAAGAG ICCGGAI ACGC I'GGA I I'CAAC AGCA'.'G rr.G l\cM 
t.CAACGTCATCATCGACGGAAGCTTCCCTAACCATGCATTCCACATCTTCCAAGGT rCGTTG'TTAOGAG 4G00 
4910 4920 4930 4940 4950 4960 4970

AACTCTTTTTTTTTGTTTTCGTGACCTrCACATAGTCTCGGATGTTTATAAAAGTGAGGrcrCTCGCACA 4970 
CCTGCCATAAAATG T GAA TCCGCCCA I' r PG r rGGTACAAAAAACTTTGCAGAGCACC I'GCT PTATAr ATT ZQU} 
TTTAGGATAAA PG r CA PACGG PA P TTGTCAAACCC AAACTTTTTAAATTTTA T |" V I'CAGATCAAAAATTG 51 :0 
A FG rTAAAAGTTTAAGATATTTACGAAAAAATGTTTTACTTAAAAC rr FTTTATATGGATAAAA fTTTAG 510O 
AACATTCAGA fAGGASTTCCGTCCC TAAAACTTTTGTGGTCGCCGAGAAGCTTTCGAATTAATAA I C TCA 

5260 5270 5280 5290 5300 5310 5320

T I' I I'A TAAG rTCGAAACTAATTTTTTGTGCAACTCi AAT f T ITAGAGATGAAAC TTT AAGTT I CAA PTTAT SittO 
CCCAT r rGAAACCGTCCCCTTCTATAAAACTTCAAAAI' rr rCAGAGTTCAACGTCAGACGAAAAGTCTCC b'JUO 
CTCATCAGACGATC f I'AC rCTTAACGCCTCCATCGTGACAGCTATCAGACAGCCGATAGCCGC AACACCG 5460 
GT '* rCTCCAAATATTATCAACAAGCCTGT TGAGGTGAGTATTTTTTGTTTCTGGG r AGAGGCTTCTTG T'C 5630 
AAGTTTGGCCTAAATTATA V I'AAC r TGGT TTAGAGGCTGGCAAAGCCATTGA I CAA3CATGGG0TAAAC r 56CO 

5610 5620 5630 5640 5650 5660 5670

GGGCCCGCCTGAACCATGTACA fAITC r iTGGCCCGAGTAGTTGCAATCTAAAGAT rCGAAGCTGGC <" I C 5670 
A A AG I'CGGACTAGGCAAAAG TGCAAAAA IGGAAAA rAFC TTGAAATTCAATACGCT P TCCGTCTTTCCAT 57W 
rC rTCCTTTTTTGTCGTGTTTTGTfGCAGATTTTCCCTTTTTTAGATTTTAAGATTrtGATACACTT I AA 58 m 
TGTCTGGCr rCGCCTTCCTAAGAGCCTTCCTATAT rrrCGAAAAATAATCAATTTTTAGGAAAAACCAAr. S6&0 
AJ1TGGCAGTGAAAGGAGTGAAAAGCACAGCGAAAAAAGATCCACCTCCAGC rGTTC ZGCCACGTGACACC 5950 
5960 5970 5980 5990 6000 6010 6020

Cagccaacaa rc.GGAG r re; r rA gtc ca att atggc acata ag a ag ttg ac a aa tgg tacgtttatttc i*g 6020 

AACI I SAC r 1'ATG rTrCGGTC(S(ST6AC6TTTTT6TTCiAC£ATClT0AT6Ci^ AAG rAATTTTTGGATTATTT 6090 
AAAGTTTGTGCGGGAA TAG f'AAGGAGG AGTACAATTATTTTATTTGTAGA AGGTTC rAAAACTTTGAT 6 160 
r r TTCTGACCATAAGTTTTTTT TC PGAAAG rTGTTTAAAAAATTCAGTTAAAAAAT &AAA I AA PAC TC T A 6230 
TAAAAAN! C I AAA rTCTTGGGAATTTTTTTCAAAAATGTTTTTCCAAA r A TG TCT ^ATAGTAAGATTTG 6300 
6310 6320 6330 6340 6350 6360 6370

r r rGTGAATTTACAAAACATAT TTTAAAA PAC ATT T I AAT I' TA P PC6ATTTTTTCG PTCGCAGGAACGAC 637C 
AAAAAATCAGAAAAAGCGAAATTTAATTCCAAAAAAAATATTTTGAAAAC PCACAA a PAAAGCTACTttt 64a; 
CA AAAAATC AAC AAA AAAAATATO.AAAAGATTCAT AT7TTCAGAAATAGAG ACATA * TCA AAA AC PA.CA 66 IC 
AAAAATTCACTATTTTCCSGAAAAAAAT7GAGAAAAAT TCCAAAAA ITG f AAA AAA * A A A A T T G A.", A A A A 6WK3 
AT I'CCAGAAATTGAAAAAAAATTCCTTTGAGGAAAATTTAAAAATTTTAAATGTGT jATTTCTGAA ACJ.'A 66F0 

6660 6670 6680 6690 6700 6710 6720 


AGCATTTTCCGACTTTTC^GCGATTTTCAGACTTGGGCTATAAATTTTTGTCAAAA :rAGf;AAI'C:r TAAA 6770 
A i A T TTCTATTTTTCCAACAATCCTCCTCAATCTC AAA TTC AT ATT TT AT A ATTT*.: \G AC C C C G TG AT A T 6 /'JO 
C i GAAAAACCAGAACC "G AAAAGC7CCAA rCAATGAGCA TCGACACGACGGACGT r XACCGC TTCCACC G8C0 
TOTAAAATCAGTTGTTCC ACTT AAAATGACTTCAATCCGACAACC ACCAACGTACG \TGTTCTTCTAAAA <:.S?.0 
C A AG G AA A A ATC AC A TC G CC TG TC A AG TC G ~ T TG G T C A G T 6 C AC C C CC C A C C TCC A VPTAT PA TG AC AAA 70(^0 

7010 7020 7030 7040 7050 7060 7070 


TG A C C AT TT TG C A GG AT A T G AG C AG TC G TC C G CG T C TG A A G AC TC C A T TG T GG C T C \TG C G T C CO C T C A G 7070 
G PGACTCCGCCGACAAAAACT TCTGG PAA PCAT !*CGC l"GG AGAGAAGGATGGGAAA ;AA TA AGACA -'CAC-: 71W 
G PAAAP rTPGGAAAC T f TGaT P T T T T PTC : rTGAAAAATAGCTTCAAATTATAAATT r T A A A A A A T C C C G /V> -0 
AAAAATGATGTTTG T CAAAGGAAAACTTTTGATTTTTTG6T PTCTGAACTG r T TCG r P PA A AG T \?\&CX(\ 79ft\ 
ACG I'TGGAGC I'CG PACCAAAAACT7TTTCT7TTGATAATT TTTGAATCTATAGTAT ! T A A T T T "TGG A A /3-rO 



WO 98/24810 111/270 PCT/EP97/06956 



Friday. 28 November 1997 12:03 n anm 
pBa KS/X16 age 

7360 7370 7380 7390 7400 7410 7420

TTGTGAAAG ITC fCTTGAGATGTAT TAAG I T TTAGGCATAGGCAGG rGTGTAGGC/.GAAAGG TATCATG r 7'l?0 
ACiCiCAGATAGGCTTGAI A I T TACCAAGCCAATAAACAG rAAATAATATTTAAAAA,* AAACAf VGAATAAA 7400 
ICAAAAGCTAATAAI IA rrGTTTATTGGACCTACCAACACCTTACATTTGCCTAC/ RiCTTArCTA I" ' C'C 7S(k) 
ITGTTGTCTACArrTTGAACGTTAArCACTAATTCGGTGAArGAACACTTGTAGAI r rTTAATTTCGACA /«'10 
G rAATTTTTGAGCACATTGGCGTTAGAAi rrGAAAAAAAAr.GTTGGACAGrTGAA fCCTCATAACTCrCA 770O 

7710 7720 7730 7740 7750 7760 7770

AAA TA TTTCAGAATCCAGCGGC TACACCTCTGACGCCGGrGTTGCGATGTGCGCCAAAATGAGGGAGAAG 7770 
C I GAAAGAA rACGATGACATCAGTCGTGGAGOACACiAAOrifXTATCCTGACAAG rGA^TTTTGG IAS /H'lO 

I AG I^ G J£ G ^^CTTGACACACATATOAACACArTCGCTGCTCGTTTCGGTGGTC^GGGAIjCCATGCASC 7910 
AA ; T AATCC AGAAGG CTC AA AATT AA i'GAGCATC ACTTGG TGA rCGAGGAATCCCCGAAAGACG'!' I' TGAT 70UO 
AGCATCTTCTTC I' r rTGCATTCTTT ;' I'CTCTCTCTGCTGGGAG rCCCTGTTACACAGACATCTATTC A Td 8050 

8060 8070 8080 8090 8100 8110 8120

CGCGAAGTGCAAT T I I GGTTCC7AAAGATGAGAGGAGGAGGAGGAGCTATGAC I'TATGAATGAGCATCGA 3l->0 
vjGAACCGCGTGAAATAG TTTGGAGCTTAGATTGCAAATTACAG6AATATTCCGG ICiACCTCAGTC fAC TT 61«0 
^ G ^ A JJJ GGGGG CAGTTTGGTAAAAC fTGAGCCAAATTT PG ITTGGGTCACGCGTAATTTTCAAACAGrC (J2<30 
^9£IiJ?"CAACAAS TCTTGCTGGCCCAAG TC TTTCTAGAGTC IG rACCAAGCTTGC TCCAATAC 8330 
I I I I rs»GGCCAAALTTiGGSAAGAAGTTCCCGCCAAATGCATTTTTTCAAAAAT4CTAGAACCAACATC.' O'lOO 
8410 8420 8430 8440 8450 8460 8470

GG . TTACiCGCAAGTTGTCTCCTAGGAGGATATCCAAArGT i' I'ATTTAC TCTC. TCTC TTTCTAACr.AAGAT 8470 
CTA I CGGTGCATCTCA I'AATGGAACCGATGCGGTTTTGCGCGC I" rCGAAtjAGCAT I I'CTTTGCTTT i'G I r 8'J'4(') 
oCI I riTTGTTATGCCTCTGAATTTGAAACTAGTTATGAAA I' rTCAGATGTT I'CAC TGGAG A TGGCCAAG 8610 

^AAlJr-IoII^CO AGTTTCATGAATTTTGAAATTCTr ' TAGTTCT ^ 3C ATACTATAGC PC 8680 

AAAAAGG : TCv.GCT f&AGCCCTCCC TAAATTCAAAATT fCC ITTC AAGTCC I'G TCC ATTTA A I C TACCTT 8750 

8760 8770 8780 8790 8800 8810 8820

Ot^atJ^™^ 00 WCATTCTTC rrCGTCTCAATTTC rTTTTC'ATGTCACC -XGATCC IGTCGGT 88?0 
TTCAATC Tier rTCAATTCACCCTAGATTIGGCGC ACAAGC TCCCC 10 TC. TCTCTA \ rGCACC TG TCC T C 8090 
^TTCCAT, f ITGGCGACTTGC I' rCCTTCCCTTCCCTTCCGTCGCCGGTTTCCT I CCGATTGCC I' TGT 8980 

^::ir^ TATAAI ^attcattaagcccagaacaaacgaacgggg tttctttttc rcc rcTTAcrATCTTTT oo^o 

. OoCGuAAriGAATTCGT C r P3TGATAGTGAAG I'G TTTGTCAAGAATTTT TGG I IT fTGGTAGC I TGCCA 0100 
9110 9120 9130 9140 9150 9160 9170

Y--A QII AC ;^T^ CACCATCTTGCTCAGCATTGTr ^TCAA IATCTTGGTT fTATTTCAAAATTTG T I 91 70 
^ A !T? A ^ A, ;L r ^I' jTAGGCAGai rrr CGTGGCGGGACCCACTTT7rCAAACTC. A:;cCAAAAAIGrGTr> »B>40 
n# r^A Tlrr^J^t I ^"^I'OAAC TATAACCTTAGAACGTTTTT'T TC. AAA AAT ' >' C CG A A ATC A IA r 93 It; 

/rUi?™!!*^^ 9330 
tC A KATGAC I ACtjTTCCGTtGG r^TCGCCACAAAAAfTACAGTTCC I'CGGAAGGT'" I I IG TGGCGGGGC 9450 

9460 9470 9480 9490 9500 9510 9520

aIaaa^ r^ A A f ^^SS CAAAGC , C:GGGA T6AGC rGATAAAAAATGTACT I C '/\AGGAA I A r TATQC 05,?0 
trr^^T^JAAO 0 ^! 1 '' 1 rTACrAGAA ACAGTTrAAGAAAGAAAAAAG3rTTTT- TI .AATTTAAAAA S500 
fI^^JI T I^^I*I AAAGCC/iAAr ' rT «AGTTGATCAC C I CGAG AA AAT TC AAA ..A T { Ci A A AGCC F,Vi 96GO 
rlli^TTir^ TA TCTGGAAAGCGAACTTCTAAA r TAGAAAAAC r I' A TA AAA .\AATC T A AA X G T TT 3730 
(•.AAArrTTTTATTIGAAATCCTAGr-GACTTTTrrroArTTTTCyArrTGTTTCCAAGCTAIAAGTCVM- 080< 
9810 9820 9830 9840 9850 9860 9870
r JAAGTCGCCTCC TCAACCTCAAACC AoTGTGCCTCCATATTTGGAACACACACAAGCAAAAACCAA i IGA 9870 
7ACTATGTGTTCGAGTAGCCACT7G ACAAGAAGAAACTTGCCGAC AC I'GG fGGCTGG TCACCATTCTCCT 9940 
CTCTTTG TCA T f TGCA rAATCTTTC TCCCTC TTCC7CA TAAATAACTAAACTGIGTuTCCTGOGf G fCCT 10010 
CCGCTCTCGAGGGGGGGCCCGGTACCCAGC TTTVG 1TCCC I* f TAG fGAGGGTTAATTGCGCGC TTGGCG I 10080 
AATCATGGTCATAGCTGTTTCC7GTGTCAAA7TGTTATCCGCTCACAATTCCACAC AACATACGAGCCGG 10150 

10160 10170 10100 10190 10200 10210 10220 


AASCATAAAGI'G l AAAGCC TGGGG rGCCTAATGACiTGAGCTAACTCACATTAATTGCG r PGCGCTCAC TG IQW.C 
C C G C TT TC C AG TC GO G A A A CC T G T CG TGC C A GC T GC A 7 T A A f G A A PCGGCCAACGCGCGGGG AGACG^G 10~Q0 
GTTTGCGTATTGGGCGC: l'CI' fCCGCT TCCTCGCTCACTGACTCGC TGCGCTCGG7C.GTTCGGC rGCGGCfi 1036" 
AGCGG I'ATC AGCTCACTCAA AGGCGG T ^AT ACGGTTATCC AC A GA A TC AG GGG ATA AC GCAGGAAAG A AC IO'I30 
ATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCuCGTTGC TGGCG I 1*1 rrcCATAGGC 1<)bOO 

10510 10520 10530 10540 10550 10560 10570

TCC6CCCCCCTGACGAGCATCAC.AAAAA7CGACGCTCAAGTCAGAGGTG6CGAAACCCGACAr,GACTAl A 1(W0 
AAQA rACCAGGCGT^TCCCCCTGGAAGCTCCCTCGTGCGC TC I'CC rG T TC C G ACCC TGCCGC *" T ACC G f-i A 108'10 
I ACC rCTCCGCCTTTCTCCCTTCGGGA AGCG l'GGCGC V I* rc TCATAGCTC ACGCTG TAGGTA IC I CAC IT 10710 
CGGTGTAGG !CG F rCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCG T rCAGCCCoACCGCTGCGOC IT 10780 
A " CC GC T A A C T A T C G r C T TG AG T C C A A C. C C G G T A A G AC AC G AC TT A TC G C C AC TG G C AG C A ' j C C AC T G G I 10850 
10860 10870 10880 10890 10900 10910 10920

AACAGGAT TAGCAGAGCGAGGTATGTAGGCGGTGC I'ACAG AGTTC TTGAAGTGGTG HOC f A AC TACGGCT \QU20 

A:aCTACAAGGACAGI AriTGGTATCTGCGCTCTGCTGAAGCCAGTTACC: I* fCGGA AA AAC AGTTGGTAG 10990 

Ci C fTGATCCGGCAAACAAACC ACCGC I'GG rAGCGGTGGTTTTTTTGTTTGCAAGC AGCAG ATTACGCGC 1 10B0 

AoAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCA 5TGGAACGAAAAC I 1 1130 

CACGTTAAG3GATTTTGGrCA( GAGAr rArCAAAAAGGATCTTCACCTAGATCC Tf !' I'AAA TTAAAAATG I 1200 

11210 11220 11230 11240 11250 11260 11270

AAGTTTTAAATCAA I'C i'AAAGTATATATGAGTAAACTTGGTCTGACAG T TACCAAT .1CTTAATCAGTGAG 1 1270 
GCACCTATC TCAGCGATCTG IX I'A I JTCG7TCATCCATAGTTGCCTGAC7CCCCGT IGTGTAGAT&AC I'A 1 13'^ 
CGATACGGSAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCA :GCTC ACCGGCTCC 1 J MO 
AGATTTATCAGCAAVAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAG i'GG rcCTG lAACTTTATCCGCC 1 H180 
ICCATCCAGTCTATTAATTG I' I'GCC GGGAAGC TAG AGT AAG TAGT TCGCC AG fTAA rAGTT TGCGCAAC.'i 1 IGCO 

11560 11570 11580 11590 11600 11610 11620

T rGTTGCCA * TGCTACAGGC ATCGTGGTGTCACGC fCG rCGTTTGGTATGGCTTCA C rCAGCTCCGGTTP 1 16?0 
COAACGATC AAGCCGAGTTaCATGATCCCCCA TGT TG fGCAAAAA AGCGG TTAGC TO C 1 IT.GG PCCTCCG 1 
ATCG I* IG PC AGAAGTAAGTTGGCCGCAGTG r TATCAC I'C A rGGTTATGGCAGC ACPiC A ;*AA I rr 1X7TA I I /<>n 
C I G r CATGCCATCCGTAAGA IGC f r ITCTGTGACTGGTGAGTACTCAACCAAG TCA rrcTGAGAATAG i C 1 1030 
T ATGCGGCG ACCGAGTTGCTCTTGCCCGG^G I CAA fACGGGA TAATACCGCGC CAC. V r AGC AG A AC TT 7 A 1 1000 
11910 11920 11930 11940 11950 11960 11970

AAAG TGCTC ATCATTGGAAAACGTTCTTCG oGGCGAAA ACTC T CAAGG ATCTTACCilC ToT I'GAGA TCCA 1 V) /:) 
f:"TCGAi'G(AACCCACTCGTGCACCCAAC I'GA rcTTCAGC ATCTTTTACT T TCACCAGCGTTTCTGGG TG ^CW 
A'xjC AAAA AC AoGAAUGC AAA ATGCCGC AAA AAAGGGAATAAGGGCG AC ACGGAAA I I GAATACTCATA 1 i \ '0 
CX rTCCTTTTTCAATATTAI !GAAGCATTTATCAGCCTTAT7GTCrCArGAGCGG.\TACATATrrGAAT 1210 ; 
G A | rrAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCAr 1223 7 
pLM3 (1> 10847) Site and Sequence ( Xx^L %6\) Kage r

Enzymes : 100 of 146 enzymes (FJ&raCI) 

Settings: Linear, Certain Sites Only. Standard Genetic Code 

gacggatcgggagatctcccgatcccctatggtcgactctcagtac aatc tgc tctgatgccgcatagttaagccagtatctg ctccctgcttgtg tgtt 

CTGCCTAGCCCTCTAGAGGGCTAGGGGATACCAGCTGAGAGTCATGTTAGACGAGACTACGGCGTArCAATTCGGTCATAGACGAGGGACGAACACACAA 
T 0 ft E 1 S R S P M V Q S Q Y N L L . C R I VK PVSAPCLCV 

GGAGGTCGCTGAGTAG ^GCGCGAGCAAAATTTAAGCTAC AACAAGGCAAGGCTTGACCGACAATTGC ATGAAGAATCTGCTTAGG GTTAGGCGTTTTGCG 
CCTCC AGCGACTCATCACGCGCTCGTTTTAAATTCGATG TTGTTCCGTTCCGAACTGGCTGTTAACGTACTTC TTAGACGAATCCCAATCCGCAAAACGC ^ 
GGR ,' V VR g QN LSYN KARLORQLHEESA.G.AFC 



CTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGATTATTG AC TAGTTATTAA TAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATA TA 
GACGAAGCGCTACATGCCCGGTCTATATGCGCAACTGTAACTAATAACTGATCAATAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATAT 
AA SRCTG QIY ALTLI ID , LL IVlN Y G V I S S . P I Y 

TCGAGTTCCGCGTTACATAACTTACGGTA^ 

acctcaaggcgcaatgtattgaatgccatttaccgggcggaccgactggcgggttgc tgggggcgggtaac tgcagttattactgcatacaagggtatca 1400 

G V P R Y I T Y G K V P A V L T A Q R P PPIOVNNDVCSHS 

aacgccaatagggactttccattgacgtcaatgggtggactatttacggtaaactgcccacttggcagtacatcaagtgtatcatatg ccaagtacgccc 

rTGCGGTTATCCCTGAAACGTAACTGCAGTTACCCACCTGATAAATGCCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTTCATGCGGG ^ 

nanrdfpl ts mgglf tvn cplgstssvsyakya 

CCTArTGACGTCAATGACGGTAAArGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCC TACTTGGCAGTACArCTACGTATTAGTCA 

ggataactgcagttactgccatttaccgggcggaccgtaatacgggtcatgtactggaataccctgaaaggatgaaccgtcatgtagatgcataatcagt 600 

PY ; RQ ' *. ■ MA RL ALC P V HQL MG L S YL A V H L R I S H 

TCGCTATTACCATGGTGATGCGGTTTTGGCAGTAC ATCAATGGGCGTGGATAGCGGTTTGAC tcacggggatttccaagtctcca ccccattgacgtcaa 
AGCGATAATGGTACCACTACGCCAAAACCGTCATGTAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAGTT ^ 
* y y , H G 0 A V L A V H Q V A V ! A V , [_ TGI S K S P P H . R 0 

TGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCC AAAATGTCGTAACAACTCCGCCCCATTGACGC AAATGGGCGGTAGGCGTGTACGGTGGGA5 

ACCCTCAAACAAAACCGTGGTTTTAGTTGCCCTGAAAGGTTTTACAGCATTGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGCACATGCCACCCTC ^ 
V E F V L A P K S T . G >- S K M S . Q L R P I 0 A N G R . A C T V G 

G TC TATATAAGCAGAGC TC TCTGGCTAACTAGAG AACCC ACTGC TTAC TG3CTTATCGAAAT TAATACGAC tcactatagggagacccaagc tggc tagc 
CAGATATATTCGTCTCGAGAGACCGATTGATCTCTTGGGTGACGAATGACCGAATAGCTTTAATTATGCTGAGTGATATCCCTCTGGGTTCGACCGATCG ^ 

r — > , 

T7 promoter/priming site

G L Y K Q S S > * N RTHCLLAYRN.YDSL.GOPSVLA 



GTTTAAACTTAAGCTrACCATGGGGGGTTCTCATCATCATCATCATCATGGTATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCTG TACGAC 

CAAATTTGAATTCGAATGGTACCCCCCAAGAGTAGTAGTAGTAGTAGTACCATACCGATCGTACTGACCACCTGTCGTTTACCCAGCCCTAGACATGCTa ^ 

K 

ProBond binding domain
F * L t< lT "GGSHH HHHHGMASMTGGQOMGR0UY0 



WO 98/24810 114/270 PCT/EP97/06956 



Tuesday. 18 November 1997 13:58 
pLM3 (1 > 10847) Site and Sequence 

GATGACG ATAAGGTACCTAAGGATCC AG TGTGGTGGAATTCTGCAGATATCGAATTCCTGCAGCCCCT GCTCT TCAGCCAGATGC TGGACCCAGAG TCCZ 
CTACTGCTATTCCATGGATTCCTAGGTCACACCACCTTAAGACGTCTATAGCTTAAGGACGTCGGGGACGAGAAGTCGGTCTACGACCTGGGTCTCAGG- 


-insert pLM1

-ORF pLM1

D D 0 K V P K 0 P V W V N S A 0 I E F LOPLL FS0ML0PE3 

AGAGAAAGAGGACAGTGCAGAATGTCCTGGATCTCCGGCAGAACCTGGAAGAGACCATGTCCAGCC TGCGAGGG TCCCAGGTGAC TCACAGCTCCC tgga 
TCTCTTTCTCCTGTCACGTCTTACAGGACCTAGAGGCCGTCTTGGACCTTCTCTGGTACAGGTCGGACGCTCCCAGGGTCCACTGAGTGTCGAGGGACC" 



-ORF pLM1



Q R K R T V Q N V L D L R Q N L S E TMSSLRGSQVTHSSLE 
GATGACCTGCTACGACAGCGATGATGCCAACCCACGCAGCGTGTCCAGCCTCTCCAACCGCTCGTCCCCTC tgtcatggcgcta tggccagtccag tccg 

ctactggacgatgctgtcgctactacggttgggtgcgtcgcacaggtcggagaggttggcgagcaggggagacagtaccgcgataccggtcaggtcagg: 



-ORF pLM1

" T C yOS DDANPRS VSS LS NRSSPLSVRYGQSSP 

CGGCTGCAGGCTGG t'gacgcgccctctgtgggtgggagctgccgctcggaggggacgcccgcctgg tacatgcacgg cgaacgggcccactactcccaca 

GCCGACGTCCGACCACTGCGCGGGAGACACCCACCCTCGACGGCGAGCCTCCCCTGCGGGCGGACCATGTACGTGCCGCTTGCCCGGGTGATGAGGG7G- 



-ORF pLM1



R L Q A G D A„P SVGGSCRSEGTPAVYMH G E R A H Y S H 

CCATGCCCATGCGCAGCCCCAGCAAGCTCAGCCATArCTCCCGCCTGGAGCTGGTCGAATCCCTGGACrCGGArGAGG TGGACCTCAAGTCCGGCTACA- 
GGTACGGGTACGCGTCGGGGTCGTTCGAGTCGGTATAGAGGGCGGACCTCGACCA GCTTAGGGACCTGAGCCTACTCCACCTGGAoTTCA5GCCGArGTM 

-insert plM1 



-ORF pLMl 



T » P M R S P S K L S H ! S R I £ V ^ ^ *" D ? P E V ^ L ^ jj Y f 

GAGCGAC AGTGACCTCATGGGCAAGACCATGACGGAGGA TGATGACATCACTACCGGCTGGGATGAAAGCAGC TCCArCAGfAGTSGAC TCA GCGAfGCC 
CTCGCTGTCACTGGAGTACCCGTTCTGGTACTGCCTCCTACTACTGTAGTGATGGCCGACCCTACTTTCGTCGAGGTAGTCATCACCTGAGTCGCTACG^ 



-ORF pLMl 



S 0 S D L M G K rMTEDOOtTTGVDESSSISSG 



ISO 




TCAGACAATCTC AGTTCAGAAGAATTCAATGCCAGCTCCTCACTCAACTCCCTCCCAAGTAC tcccac tgcttctcgcaggaactca&caatagtgctac 
AGrcrGTTAGAGTCAAGTCTTCTTAAGTTACGGTCGAGGAGTGAGTTGAGGGAGGGTTCATGAGGGTGACGAAGAGCGTCCrTGAGTTGTrATCACjATJ 



-insert pLMl 



-ORF pLM1 



5 0 N L S S E E F N A S S S L N S L P S T PTASRRNST IVL 

GCACAGACTCAGAGAAGCGCTCACTGGCAGAAAGTGGGCTGAGCTGGrTTAGTGAATCAGAG GAGAAAGCCCCTAAAAAACTGGAGTACGACAGTGGTAG 
CGTGTCTGAGTCTCTTCGCGAGTGACCGrCTTTCACCCGACTCGACCAAATCACTTAGTCTCCTCTTTCGGGGATTTTTTGACCTCATGCTGTCACCATC 



-insert pLM1 



-ORF pLM1



Rf DSElCR SLA ESGLS VF SES EE KAP K KLEY0SG3 

CCTGAAGATGGAACCTGGGACTTCTAAGTGGCGGAGGGAGCGGCCTGAGAGCTGTGATGAT TCATCCAAGGGTGGAGAACTGAAAAAGCCCATCAGCCTG 
GGACTTC TACCTTGGACCC TGAAGATTCACCGCCTCCCTCGCCGGACTCTCGACACTACTAAGTAGGTTCCCACC TCTTGAC TTTTTCGGGTAG TCGGAC 



-insert pLMl 



-ORF pLMl 



L K M E P G T S KWRRERPESCDDSSKGGELKKP ISL 



GGCCACCCTGGTTCCCrGAAGAAGGGCAAGACCCC ACCTGTGGCTGTAACTTCCCCCATCACTCACACAGCCCAGAGTGCCCTCAAAGTCGCAGGCAAAC 


CCGGTGGGACCAAGGGACTTCTTCCCGTTCTGGGGTGGACACCGACATTGAAGGGGGTAGTGAGTGTGTCGGGTCTCACGGGAGTTTCAGCGTCCGTTTG " ' 



-insert pLMl 



-ORF pLMl 



G H p G S L K KGK TPPVAV TSP I THTA05ALKVAGK 



C «'G AGGGCmAAGCTAC AGACAAGGG TAAGC TTGC AGTGAAGAAT ACTGGGCTCCAACGCTCC TCC TC TGATGC TGG KGGG ACCGCC TG AG TGATG : 7 A A 
GACTCCCGTTTCGATGTCTGTTCCCATTCGAACGTCACTT C TTA TGACCCGAGGr TGCGAGGAGGAGACTACGACCAGCCCTGGCGGAC TC ACTAC'JA T T ' 

-insert pLMl 



-ORF pLM1 



P £ G K A T °, * G < > A V K N T G L Q R S 5 SOAGRDRLSDA> 

GAAGCCCCCCTCGGGCATTGCTCGCCCC ^ccacttcgggatccttcggct acaagaagcctcctcctgccacag gcacagcc ACTGTCATGC AAAC TGG 7 
CTTCGGGGGGAGCCCGTAACGAGCGGGGAGGrGAAGCCCTAGGAAGCCGATGTTCTTCGGAGGAGGACGGrGTCCGTGTCGGTGACAGTACGTTTGACCA 



' " -ORFpLMl — 

K P P . S G 1 ARPSTSGSFGYKKPPPATGTA TVMQTG 




GO FTC AGCCACTCTCAGCAAGATCCAGAAGTCCTC AGGCATCCCTG TCAAGGCAGTAAATGGG CGCAAGAC TAGCTTAGATGTTTCC AACAGCGCAGAGI.' 
CCAAG rCGGTGAGAGTCGTTCTAGGTCTTCAGGAGTCCGTAGGGAC AGTTCGGTCATTTACCCGCG TTC TGaTCGAATCTACAAAGG fTGTCGCGTCTCU '"^ 



-insert pLMl 



-ORF pLMl 



CSA TLSKI QK SSC IP VKPVN GR K T S L 0 V S N S A E 

C AGGATTCCTGGCTCC TGGAGCCCG TTCTAACATCCAGTACCGCAGCCTGCCCCGGCCAGCCAAGTCAAGTTC rA TGAGCGfGACCGGCGGGCGGGGr«jc 
GTCCTAAGGACCGAGGACCTCGGGCAAGATTGTAGGTCArGGCGTCGGACGGGGCCGGTCGGTTCAGTTCAAGATACTCGCACTGGCCGCCCGCCCCAC: 



-insert pLM1 



-ORF pLMl 



P G F L A P G A R S N 1 Q Y R S L P R P A K S S S M SVTGGRGG 

ACCTCGCCCTGTGAGCAGCAGCATTGACCCCAGTCTCCTCAGCACCAAGCAGGGAGGCCTTACGCCTTCCAGACTGAAGGAG CCT^^ 
TGGAGCGGGACACTCGTCGTCGTAACTGGGGTCAGAGGAGTCGTGGTTCGTCCCTCCGGAATGCGGAAGGTCTGACTTCCTCGGATGGTTCCATCGGTCA 



-insert pLM1 



— —ORF pLMl 

P R P V S S S j 0 P S L L S T K Q C G I T P S R L K E P T k V A 3 

GGGCGGACCACTCCAGCCCCTGTCAATCAGACAGATCGGGAAAAGGAGAAGGCCAAAGCCAAGGCAG TGGCCTfGGAC TC AGAC AACATCTCCfTGAAGA 
CCCGCCTGGTGAGGTCGGGGACAGTTAGTCTGTCTAGCCCTTTTCCTCTTCCGGTTTCGGTTCCGTCACCGGAACCTGAGTCTGTTGTAGAGGAACTTCT ^ 



-ORF pLMl 



g * T , T P A P V " Q . T 0 R E K E K A K A K A V A L 0 5 0 N I S L > 

GTATTGGCTCCCCAGAGAG TACTCCCAAGAACCAAGC AAGCCACCCCACAGCCACCAAGC TGGCAGAGC TGCC ACCAAC CCC TC TCAGGGCC AC AGCG.' 
CATAACCGAGGGGrCTCTCATGAGGGTTCTTGGTTCGTTCGGTGGGGTGTCGGTGGTTCGACCGTCTCGACGGrGGTTGGGGAGAGTCCCuG TG TC GC T ' 



-insert pLMl 



-ORF pLMl 



3(GSP£ S TP< N3ASH PT A TKLAELPPTPLRATA* 

GAGCTTTGTCAAACCACCCrCACTAGCCAATCTTGACAAGGTCAACTCCAACAGTCTGGATCTACCATCA rCCAGTGATACCACCCATGCTTCAAAGGT: 
CTCGAAACAGTTTGGTGGGAGTGArCGGrTAGAACTGTTCCAGTTGAGGTTGTCAGACCrAGATGGTAGTAGGTCACTATGGT GGGTACGAAGTTTCCAr: 

- insert pLMl 



-ORF pLM1

X P P S L ANLOK VNSNSL0LPSSS0TTHA3J V 




CCAGATCTGCATGCTACAAGCTCAGCATCTGGGGGCCCTCTCCCTTCCTGCTTCACCCCCAGTCCGGCA CCCATCCTCAATATTAACTCAGCCAcCTTC T 
.vi^CTAGACGTACGATGTTCGAGTCGTAGACCCCCGGGAGAGGGAAGGACGAAGTGGGG GTCAGGCCGTGGGTAGGAGTTATAATTGAGTCGGrCGAAGA 

-insert pLM1 






-ORF pLMI 



* 0 L H A T S SASGGPLPSCFTPSPAP ILN IN 



3 A S F 



CCrAGGGCCTGGAGCTAATGAGTGGTTrCAGTGTGCCAAAAGAGACCCGCATGTACCCCA AACTCTCAGGCCTGCACAGGAGCATGGAGTCCCTCCAGAT 
GG3TCCCGGACCTCGATTACTCACCAAAGTCACACGGTTTTCTCTGGGCGTACAT GGGGTrTGAGAGTCCGGACGTGTCCTCGTACCTCAGGGAGGTCT^ 

•insert pLMl 



-ORFpLMl 



S Q G I E L MSG F S V P K E T RM Y P K L S G L H RSMESLOM 

GCC AATGAGCCTCCCCAGTGCCTTCCCCAGCAGTACTCCCGTCCCCACCCCACCTGCTCCCCCTGC TGCTCCC ACAGAAGAAGA GACGGAAGAGCTGAC7 
CGGTTACTCGGAGGGGTCACGGAAGGGGTCGTCATGAGGGCAGGGGTGGGGTGGACGAGGGGGACGACGAGGGTGTCTTCTTCTCTGCCTTCTCGACTGA 



31 OC 



-insert pLMI 



-ORF pLM1 



P MS LPS AFP S5 T P VP TPPAPPAAPTEEETEEL T 

T'j^ AG TGGAAGCCCCAGAGCTGGGCAAC TGGACAG TAATCAGCGGGATCGGAACACTC TTCCCAAGAAAGGGCTC AGGTACC AGC TTCAGTCCCAGGAGo 
ACZTCACCTTCGGGGTCTCGACCCGTTGACCTGTCATTAGTCGCCCTAGCCTTGr GAGAAGGGTTCTTTCCCGAGTCCATGGTCGAAGTCAGGGTCCTC: 

-insert pLMI 




-ORF pLMI 

-SGSPRAGQLOSNQRORNT 



LPKKGLRYQLQSQE 



A'jACC AAGGAGAGGCGACATTCCCATACCATTGGTGGGC rGCCTGAATCCGATGACCAGrCAGAGCTGCCTTC TC CCCCTGC ACTTCCCAfGTC TC TG.c^ 

"C "OGTTCC TCTCCGC TGTAAGGGTATGGTAACCACCCGACGGACTTAGGC TACTGGTCAGTCTCGACGGAAGAGGGGGACGTGA AGfiilTAfiV:; T" J "'"'' 



-ORF pLMl 



Erk£RRH SHr ^GG'-PES DDQ SELPSPP A L P M S L 

to:aaagggccaacttaccaacatagtgagtcccactgcggccaccacgccaagaatcacccgctccaacagcatccccacccacgaggcggccttc 

AC3TTTCCCGGTTGAATGGTTGrATCACTCAGGGTGACGCCGGTGGTGCGGTTCTTAGTGGGCGAGGTrGrCGTAGGGGTGGGTGCTCCGCCGGAAoCTC ^ 



-insert pLM1 



" ORF pLMl — Z, 

" * G Q 1 T , N 1 V S P TAATTPRITRSNS IPTHEAAFE 






C TGTACAGCGGCTCCCAAATGGGGAGCACCCTGTCCCTGGCCGAGAGACCCAAGGGA ATGAT rCGGTCAGGATCCTTCCGAGACCCCACGGACGATGTTL' 
G^CATGTCGCCGAGGGTTTACCCCrCGTGGGACAGGGACCGGCTCTCTGGGTTCCCTrACTAAGCCAGTCCTAGGAAGGC TC TGGGG TGCC TGC TACAAU 



- insert pLMl 



-ORF pLMl 



L YS GSQMGST L SI AE RPKGM I RSGSFROPTDDV 

AC3GCTCAGTGCTGTCCCTGGCCTCCAGTGCCTCCTCCACCTACTCCTCAGCTGAGGAGAGGA TGC AA TC TGAGC AAA TCCGGAAGC TTCGTAGGG AAC " 
TGCCGAGTCACGACAGGGACCGGAGGTCACGGAGGAGGTGGATGAGGAGTCGACTCCTCTCCTACGTTAGACTCGTTTAGGCCTTCGAAGCATCCCTTGA 



36 y 



-insert pLMl 



-ORF pLM1 



H G S V L S L A S S A S S T Y S S A E ERMOSEQIRKLRREL 

GGAATCATCCCAGGAAAAAGTGGCCACCrTGACGTCTCAGCTTTCTGCCAATGCTAATCTGGTGGCTGC TTTTGAGCAGAGCCTGGTGAATATGACATCC 
CCTTAGTAGGGTCCTTTTTCACCGGTGGAACTGCAGAGTCGAAAGACGGTTACGATTAGACCACCGACGAAAACTCGTCTCGGACCACTTATACTGTAGG 



J7CC 



-ORF pLM1 



E S 5 Q E K V A T L T SQ L SANANLVAAFEQSLVMM T 



CGCCTGCGACACCTGGCAGAGACGGCCGAGGAGAAGGACACTGAGCTGCTGGATTTGCGAGAA ACCATAGACTTTCTGAAGAAAAAGAACTCTGAGGCCC 
GCGGACGCTGTGGACCGTCTCTGCCGGCTCCTCTTCCTGTGACTCGACGACCTAAACGCTCTTTGGTATCTGAAAGACTTCTTTTTCTTGAGACTCCGCiG 




-ORF pLMl 

I L I , H L A *f T A E E K 0 T E L L 0 L R E f 10 F LKKKNSEA 

A'UGCAGTCATTC AGGGAGCCCTTAATGCCTCAGAAACCACACCCAAAGAAC TTCGGATCAAGAGAC AAAAC TCCTCAGATAGCATCTCAAGCC TCAACAIi 
KCGT:A3TAAGTCCCrCGGGAArrACGGAGTCTTTGGTGTGGGTTTCTTGAAGCCTAGTTCrCTGTTTTGAGGAGTCTATCGTAGAGTTCGGAGTrGT: 



-insert pLMl 



-ORF pLM1 



Q A I ; 0 G A > N A S E T T P K £ L R I K R Q N S SDS I S S L M > 

CATCACT AGCCATTCC AGCATCGGCAGC AGCAAGGATGC TGATGCGAAAA AGA AGAAAAAAA AGAG TTGGGTC TATGAGCTTCGAAGTTCC TTC AACAAA 
GTAGTGATCGGTAAGGTCGTAGC CGTCGTCGTTCCTACGAC TACGCTTTTTCTTCTTTTTTTTCTC AACCC AGATACTCGAAGCTTCAAGGAAGTT3TT* 




3 H S S f GSSK DA0AKKKKKK5VVYELR5SFn» 




GCaTTCAGTATAAAAAAGGGGCCCAAGTCAGCTTCCTCATACrCGGATATAGAGGAGATTGCTACACCCGACTCTTCAGCCCCCTCATCCCCCAAACTA- 


CGCAAGTCATATTTTTTCCCCGGGTTCAGTCGAAGGAGTATGAGCCTATArCTCCTCTAACGATGTGGGCTGAGAAGTCGGGGGAGTAGGGGGTTTGATu 



•insert pLM1 



-ORF pLM1 



AFS IKKGP KS ASS Y S DIEEIATPDSSAPSSPKL 

AGCATGGTTCCACAGAGACTGCTTCACCCTCCATCAAGTCCTCCACCTTGTCCTCCGTGGGCACTGATGTCACCGAGGGCCCTGCTCACCCAGCCCCCCA 


TCGTACCAAGGTGTCTCTGACGAAGTGGGAGGTAGTTCAGGAGGTGGAACAGGAGGCACCCGTGACTACAGTGGCTCCCGGGACGAGTGGGTCGGGGGGT 



■insert pLMi 



-ORF pLMi 



Q H G S T E T A S P S I K S STLSSVGTDVTEGPAHPAPH 

CACTAGGCTGTTCCATGCAAATGAGGAGGAGGAGCCAGAGAAGAAGGAGGTATCGGAGCTGCGCTCTGAGCTATGGGAGAAGGAAATGAAGCTTACAGAC 


GTGATCCGACAAGGTACGTTTACTCCTCC7CCTCGGTCTCTTCTTCCTCCATAGCCTCGACGCGAGACTCGATACCCTCTTCCTTTACTTCGAATGTCTG 



•insert pLMi 



-ORF pLMi 



T R L F H A N E E E E P E K K E VSELRSELVEKENKLTD 

ATCCGCTTGGAGGCCCTCAACTCTGCCCACCAACTGGATCAGCTTCGGGAGACCATGCACAACATGCAGT TGGAGGTGGACCTGCTGAAAGCAGAGAATG 
TAGGCGAACCTCCGGGAGTTGAGACGGGTGGTTGACCTAGTCGAAGCCCTCTGGTACGTGTTGTACGTCAACCTCCACCTGGACGACTTTCGTCTCTTAC 



-insert pLM1



-ORF pLMi 



t R L E A LNS AH QLOQLRETNHNNOLEVDLLKAEN 

ACCGACT5AAGGTAGCCCCAGGCCCCTC ATCAGGCTCCACTCCAGGGC AGGTCCCTGGATC ATC TGCATTATC TTCCCCACGCCGCTCCCTAGuCC TGCiC 
TGGCTGACTTCC ArCGGGGTCCGGGGAGTAGTCCGAGGTGAGGTCCCGTCC AGGGACCTAGTAGACGTAATAGAAGGGGTGCGGCGAGGGATCCGGACCG 



-insert pLM1 



-ORF pLMi 



0 R L K V A P G P S S GSTPGQVPGSSALSSPRR3LGLA 

ACTCACCCATTCCTTCG GCCCCAGTCTTGCAGACACAGACCTGTCACCCATGGATGGCATCAGTACTTGTGGTCCAAAGGAGGAAGTGACCCTCCGGGTG 


tgagtgggtaaggaagccggggtcagaacgtctgtgtctggacagtgggtacctaccgtagtcatgaacaccaggtttcctccttcactgggaggccca: 



-insert pLM1 



-ORF pLM1

L T MSFGPSLAOTDLSPMOG I STCGPKEEV I" L P 




GTQGTGAGGATCCCCCCGC AGCACATCATCAAAGGGGAC TTGAAGCAGCA3GAAT TCTTCCTGGGC TGTAGCAAGGTCAGTGGAAAAGT7GACTGGAAGA 
CACCACTCCTACGGGGGCGTCGTGTAGTAGTTTCCCCTGAACTTCGTCGTCCTTAAGAAGGACCCGACATCGTTCCAGTCACCTTfTCAACTGACCTTCT 



insert pLM1 - 

ORFpLMI ~ 

VVRHPPQHI IKGOLKQQEFF LGCSKVSGKVDVk 

TGCTGGATGAAGCTGTTnCCAAGTGTTCAAGGACTATATTTCTAAAATGGACCCAGCCTCTACC CTGGGACTAAGCACTGAGTCCATCCATGGCTACAG 
ACGACCTACTTCGACAAAAGGTTCACAAGTTCCTGATATAAAGATTTTACCTGGGTCGGAGATGGGACCCTGAT TCGTGAC TC AGG T AG G T ACC GA TG TC 

insert pLM1 ■ 1 — ■ 



— ORF pLM1 

M L P E A V F Q V.F K 0 Y I S KMOPASTLGLSTES IHGYS 

CATCAGCCACGTGAAACGAGTGTTGGATGCAGAGCCCCCCGA6ATGCCTCCTTGCCGTCGAGGTGTCAATAACATATCAGTC TCCCTCAAAGGTCTGAAG 
GTAGTCGGTGCACTTTGCTCACAACCTACGTCTCGGGGGGCTCTACGGAGGAACGGCAGCTCCACAGTTATTGTATAGTCAGAGG GAGTTTCCAGACTTC 

insert pLM1 

ORFpLMI — 

I S H V K ft V L D A £ P P E H PPCRRGVNNISVSLKGLt. 

GAGAAATGCGTCGACAGCCTGGTGTTCGAGACGCTGATCCCCAAGCCGATGATGCAGCACTACATA AGCCTCCTGCTGAAGCACCGGCGCCTCGTCCTCT 
CTCTTTACGCAGCTGTCGGACCACAAGCTCTGCGACTAGGGGTTCGGCTACTACGTCGTGArGTATTCGGAGGACG ACTTCGTGGCCGCGGAGCAGGAGA ^ 

insert pLMI — 



— — ORFpLMI — 

EK C VOSLVFETL IPKPMMQHY ISLLLKHRPLVL 



CGGGCCCCAGCGGC ACGGGC AAGACC TACCTGACC AATCGC TTGGCCGAGTACCTGGTGGAGCGC TC TGGCCGTGAGGTCAC AGAGGGC ATCGTCASCAl" 


GCCCGGGGTCGCCGTGCCCGTTCTGGATGGACTGGTTAGCGAACCGGCTCATG6ACCACCTCGCGAGACCGGCA CTCCAGTGTCTCCCG"AGCAGT-GTu 

insert pLMI 

— -ORFpLMI — 

S G P S G T G K T Y L T N ft L A E V L VERSGREVTEGl VST 

CTTCAACATGCACCAGCA GTCTTGCAAGGATCTGCAACTGTATCTTTCCAACCTAGCCAACCAGATAGACCGGGAAACAGGAATTGGGGATGTGCCCCT^ 
GAAGTTGTACGTGGTCGTCAGAACGTTCCTAGACGTTGACATAGAAAGGTTGGATCGGTrG GTCTATCTGGCCCTTTGTCCTTAACCCCTACACGGGGAC 

insert pLMI — 



' — -ORFpLMI 

M M , H Q 0SCK0L0L YLSNLANQ IDRE TG IGOVP L 




GTGAI"TCTATTGGATGACCTGAG TGAAGCAGGCTCCATCAGTGaGTTGGTC AATGGGGCCCTCACC TGCAAG TATCATAAAfG TCCCTArATTA"A3GTA 
CACTAAGATAACCTACTGGACTCACTTCGTCCGAGGTAGrCACrCAACCAGTTACCCCGGGAGTGGACGTrCATAGTATTTACAGGGATATAATATCCAT 



-insert pLM1 



-ORF pLMl 



V 1 L L D 0 L S £ A G S I S S L V N G A L TCKYHKCPY I I G 

CCACCAATCAGCCTGTAAAAATGACACCCAACCATGGCTTGCACTTGAGC TTCAGGATG TTGACCTTCTCCAACAACGTGGAGCC AGCC AATGGCTTCCT 
GGrGGrTAGTCGGACATTTTTACTGTGGGTTGGTACCGAACGTGAACTCGAAGTCCTACAACTGGAAGAGGTTGTTGCACCTCGGTCGGTTACCGAAGGA 



-insert pLMl 



-ORF pLMl 



T T N Q P V K M T P N H G L H L S F R ML TFSNNVEPANGFL 

GGTTCGTTACCTGAGGAGGAAGCTGGTAGAGTCAGACAGCGACATCAATGCCAA CAAGGAAGAGCTGCTTCGGGTGCTCGACTGGGTACCCAAGCT37GG 
CCAAGCAATGGACTCCTCCTTCGACCATCTCAGTCTGTCGCTGTAGTTACG6TTGTTCCTTCTCGACGAAGCCCACGAGC TGACCCATGGG TTCGACACC 



-insert pLMl 



-ORF pLMl 



V R Y L R R K L V E SO 5 D I N A N K E E L L R V L 0 V V P K L V 

TATCATCTCCACACCTTCCTTGAGAAGCACAGCACCTCAGACrTCCTCATCGGCCCTTGCTTCTTTCTGTCGTGTCCCAT TGGCATTGAGGACTTCCGGA 
ATAGTAGAGGTGTGGAAGGAACTCTTCGTGTCGTGGAGTCTGAAGGAGTAGCCGGGAACGAAGAAAGACAGCACAGGGTAACCGTAACTCCTGAAGGCCT 



-insert pLMl 



-ORF pLM1



Y H L H T F L E K H S T S D F L 1 G P C F F L SCPIGIEOFR 

CCTGGTTCATTGACCTGTGGAACAACTCTATCATTCCCTATCrACAGGAAGGAGCCAAGGATGGGATAAAGGTCCATGG ACAGAAAGCTGCTTGGGAGGA 
GGACCAAGTAACTGGACACCTTGTTGAGATAGTAAGGGATAGATGTCCTTCCTCGGTTCCTACCCTATTTCCAGGTACCTGTCTTTCGACGAACCCrCCr 




-ORF pLM1 

f V F 1 0 L v N N S 1 i P Y I Q E G A K 0 G IKV HGQKAAVED 

CCCAG TGGAATGGG TCCGCGACACACTTCCCTG3CCATC AGCCCAACAAGACCAATCAAAGC tgtaccacc tgccccca ccc ACCGTGGGCCC FCACAGC 
GGGTCACCTTACCCAGGCCCTGTGTGAAGGGACCQGTAG TCGGGTTGTTCTGGTTAGTTTCGACATGGTGGACGGGG GTGGGTGGCACCCGGGAGT3TCG 

-insert pLMl 



— ORFpLMI ■ 

PVE VVR QTLP UP5AQODQSKLYHLPPPTVGPH.3 




ATTGCCTCACCTCCCGAGGATAGGACAG TCAAAGACAGC ACCCCAAGTTCTCTGGACTCAGATCCTC TGATGG CC A TGCTGC TGAAAC T TC AAGAAGC T2 
TAACGGAGTGGAGGGCrCCTATCCTGTCAGTTTCTGTCGrGGGGTTCAAGAGACCTGAGTCrAGGAGACTACCGGTACGACGACTTTGAAcTrCTTCGA^ ^ 



-ORF pLMl 



I AS PPEDR T V KDS TPSSLOS DPL MAML L K L Q E A 

CCAAC TACATTGAGTC TCCAGATCGAGAAACCATCCrGGACCCCAACCTTCAGGCAACACTTTAAGGG TTCGGCAATCAC TGTCACCCCCGGACAGCAGA 
GGTTGATGTAACTCAGAGGrCTAGCTCTTTGGTAGGACCTGGGGTTGGAAGTCCGTTGTGAAATTCCCAAGCCGTTAGTGACAGTGGGGGCCTGTCGTC' 



■insert pLMl 



ORFpLMl 

A N Y [ E S P 0 R E T 1 L D P N L Q A T L . G F G N H C H P R T A £ 

ACGCTGGCATCAGCTA ^CTTAGCTCCTCCTCTCCCCTCTCCTCTTTCAGAGCACTGGCTCTCCAGCCCCAGGAGGAGAACAGGAGGGAGGAGGAG A TGAA 
TGCGACCGTAGTCGATAGAATCGAGGAGGAGAGGGGAGAGGAGAAAGTCTCGTGACCGAGAGGTCGGGGTCCTCCTCTTGTCCTCCCTCCTCCTCTACTT ° M 



insert pLMl — — 

R VH QLS .LLLSP LLFQSTGSPAPGGEQEGGGOE 



AGAGGAGGGACAGGTTCTTGGTGCTGTACCTTTGAGAACTTCCTAGGAAGGAATGGTGGGGTGGCGTTTGGG AACTTGTGCCCCCTAAACACArTTACT J 
TCTCCTCCCTGTCCAAGAACCACGACATGGAAACTCTTGAAGGATCCTTCCTTACCACCCCACCGCAAACCCTTGAACACGGGGGATTTGTGTAAATGAL ^ 



•insert pLMl 



aGG , TG SVC CT F£N FL GRNG GVAFGNLCPLNTFT 
GCZTCCTCTAArGACTTTGGGGAAAAGATGATTCTGGGTCTTTCCCTTGACTTCTTGTTTCAATTACAAACTCCTGGGCT 

CGGAGGAGATTACTGAAACCCCTTTTCTACTAAGACCCAGAAAGGGAACTGAAGAACAAAGTTAATGTTTGAGGACCCGAAAGACCCCTCCCCAAGrCT" 



•insert pLMl 



G L 1 ; • LVG KDOSGSFP.LLVSITNS 



V A F V G G J 0 > 



AAC ATCAAAACACTGC AGC AGTTCCCCGGAATTCAGCTTGGACTTAACCAGGCTGAAC TTGC TCAAAAGAAGCCGAATT CCAGCACACTGGCGoCCGTTA 

X XQ TAGT TTTGTGACG TCG ^CAAGGGGCCTTAAGTCGAACCTGAATTGGTCCGACTTGAACGAGTTTTC TTCGGCTTAAGGTCG TGTGACCGCCSGCAA" ^ 



•insert pLMl 



JSICHC S SSPEFSLOLTRLNLLKRSR 



P A H V 4 P L 



CTAGTTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCC CTCCCCCGTGCCTTCCTTGAC 
GATCAAGATCTCCCGGGCAAATTTGGGCGACTAGTCGGAGCTGACACGGAAGATCAACGGTCGGTAGACAACAAACGGGGAGGGGGCACGSAAuSAACT-: 

=> 



v i e g p fkpaoqprlcllvashllf 



A P P P C I P 



CCTGGAAGGTGCC ACTCCCAC TGTCC TTTCCTAATAAAA TGAGGAAAfTGC ATCGCATTG XC TGAGTAGG FGTCAT TC TA TTC TGGGGCGr3G33 TGGGJ 
CGACCTTCCACGGrGAGGGTGACAGGAAAGGATTATTTTACTCCTTTAACGrAGCGTAACAGACTCATCCACAGTAAGATAAGACCCCCCACCCCACCC: " ' ''' 
P 1 K V P L P L 5 F . P " < " * KLHRIV.VGVILFWC '/ 7 * 0 




CAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGC GGAAAGAACCAGCTGGGGCTCTA 
GTCCTGTCGTTCCCCCTCCTAACCCTTCTGTTATCGTCCGTACGACCCCTACGCCACCCGAGATACCGAAGAC TCCGCCTTTC TTGGTCGACCCCGAGAT ^ 
R TA RGR I GK T [ AG ML G H R u A L V LLRRKEPAGAL 

GGGGGTATCCCCACGCGCCCTGTAGCGGCGCATTAAGC6CGGCGGG TG TGGTGGTTACGCGCAGCGTGACCGCTACA CTTGCCAGCGCCCTAGCGCCCGC 
CCCCCATAGGGGTGCGCGGGACATCGCCGCGTAATTCGCGCCGCCCACACCACCAATGCGCGTCGCACTGGCGATGTGAACGGTCGCGGG^ 
G G [ , P T R P V A * H . A R R V V V L R A A.PLHLPAP.RP 

TCCTTTCGCTTTCTTCCCTTCCTTTC ^CGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGCATCCCTTTA GGGTTCCGATTfAGTGCTTTA 
AGGAAAGCGAAAGAAGGGAAGGAAAGAGCGGTGCAAGCGGCCGAAAGGGGCAGTTCGAGATTTAGCCCCGTAGGGAAATCCCAAGGCTAAATCACG 
ILSLSSLPF S PR S P A F P V K I . 1 G A S L . G S 0 L V L V 

CGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGrGGGCCATCGCCCTGATAGACGGTTTTTCGCCC 

GCCGTGGAGCTGGGGTTTTTTGAACTAATCCCACTACCAAGTGCATCACCCGGTAGCGGGAC TATCTGCCAAAAA6CGGGAAACTGCAACCTCAGGTGCA *** 



G T S T P K N L 1 R V MVHVVGHRPORRFFAL 



R V S P R 



TCTTTAATAGTGGACTCTTGTTCCAAAC TGGAACAACAC yCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGGGGAT TTCGGCCTATTG 
AGAAATTATCACCTGAGAACAAGGTTTGACCTTGTTG TGAGTTGGGATAGAGCCAGATAAGAAAACTAAATATTCCCTAAAACCCCTAAAGCCGGATAAC ^ 
S L 1 . V D S C S K L E Q H S T L S R S 1 L L I Y K G F V G F R P [ 

GTTAAAAAATGAGCTGATTTAACAAAAA7TTAACGCGAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTC CCCAGGCTCCCCAGGCAGGC 
CAATTTTTTACTCGACTAAATTGTTTTTAAATTGCGC 72C < 

VVKVPRLPRQA 



G K M S.FNKNLTRINSVECVSVR 



AGAAGTATGCAAAGCATGCATCTCAATTAGTCAGC AACCAGGTGTGGAAAGTCCCCAGGC TCCCCAGCAGGCAGAAGTATGCAAAGCATGC ATC TCAATT 

tcttcatacgtttcgtacgtagagttaatcagtcgttggtccacacctttcaggggtccgaggggtcgtccgtcttcatacgtttcgtac^ ^ 

E VC K AC I S I SQ QP GVE SP0APQ0AEVCKAC1SI 



AGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCrCCGCCCCATGGCTGACTAATTTTTT 



TTAT 



tcagtcgttggtatcagggcggggattgaggcgggtagggcggggattgaggcgggtcaaggcgggtaagaggcggggtaccgactgatIaaaaaaaata ?liCC 

S ^ ^ . ^ ' SRP . L RPS RP . LRPV PP I L R P M A D . F F L 
TTATGCAGAGGCCGAGGCCGCCTCrGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGC TCCCGGGAGCT 

aatacgtctccggctccggcggagacggagactcgataaggtcttcatcactcctccgaaaaaacctccggatccgaaaacgtttttcg^ 7S0c 

FM QRppp p(_ PLS Y SRS S E £ A F L EA . AFAKSSREL 

TGTArATCCATT TTCGGATCTGATCAAGAGACAGGATGAGGATCGTTrCGCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTG G 
ACATATAGGTAAAAGCCTAGACTAGTTC ^^ G ^GCTACTCCTAGCAAAGCGTACTAACTTGTTCTACCTAACG TGCGTCCAAGAGGCCGGCGAACCCACC '** 
V V P F S ° > 1 « * Q D E 0 R F A .LNKMOCTOVLRplgv 



AGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTG 



tcaa 



CCGACGAGACTACGGCGGCACAAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAACAGTT 
» G V S A H T G H N R Q S A A L M P P C S G C Q R R G A R F F L S 

GACCGACCTGTCCGGTGCCCTGAATGAAC TGCAGGACGAGGC AGCGCGGC TATCGTGGCTGGCC ACGACG3GCGT TCC TTGCGCAGCTG TGC TCGACGTT 

CrGGCfGGACAGGCCACGGGACTTACTTGACGTCCTGCTCCGTCGCGCCGATAGCACCGACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAl ^ 
P * 9 P V . • H N C R T RQR G YRGV P R R A F L A 0 L C S T L 




G TC AC TG AAGCGGG AAGGG AC TGGCTGCTATTGGGCGAAGTGCCGGGGCAGG A TCTCCTGTCATCTCACCTTGCTCCTGCCG AG A AAGTATCC A TCATGG 
C AGTGACTTCGCCCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGAC AG tagagtggaacgaggacggctctttcataggtagtacc 7 * y 
3 > K . R E G . T G C Y A K C R G ft I S C H L TLLL P R K Y P s V 

CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGCGAGC ACGTACTCGGAT GGAAGC 

GACTACGTTACGCCGCCGACGTATGCGAACTAGGCCGArGGACGGGTAAGCTGGTGGTTCGCTTTGTAGCGTAGCTCGCTCGTGCATGAGCCTACCTTCG ^ 
LMQ CGGC t R L I RLPA H5TTK RN I A S S E H V L G V K 

CGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGA T 

gccagaacagctagtcctactagacctgcttctcgtagtccccgagcgcggtcggcttgacaagcggtccgagttccgcgcgtacgggctgccgctccta 3,0<: 

" ! L S ' ; M ' V T * S IRGSR QPNCSPGSRR ACPTAR | 

ctcgtcgtgacccatggcgatgcctgc ttgccgaatatcatggtggaaaatggccgcttttctggattcatcgactgtggccggctggg tgtggcggacc 
gagcagcactgggtaccgctacggacgaacggcttatagtaccaccttttaccggcgaaaagacctaagtagctgacaccggccgacccacaccgcctgg 82C< 

S S . P " A M P A C R I S W W K M A A F L D S S T V A G V V w R T 

gctatcaggaca tagcgttggctacccgtgatattgctgaagagcttggcggcgaatgggctgaccgcttcctcgtgctttacgg tatcgccgctcc cga 
cgatagtcctgtatcgcaaccgatgggcactataacgacttctcgaaccgccgcttacccgactggcgaaggagcacgaaatgccatagcggcgagggct 33CC 

A ' R T • ft W L P V ' >- L K S L A A N G L r A S S C F T V S P L P 

ttcgcagcgcatc gccttctatcgccttcttgacgagttcttctgagcgggactctggggttcgaaatgaccgaccaagcgacgcccaacctgccatcac 
aagcgtcgcgtagcggaagatagcggaagaac tgctcaagaagactcgcccigagaccccaagctttactggctggticgctgcgggttggacggtagtg 

' R S A S P S 1 A F ■- T S S S E R D S G V R N 0 R P S p a Q P A I T 

gagatttcgattccaccgccgccttcta tgaaaggttgggcttcggaatcgttttccgggacgccggctggatgatcctccagcgcggg gatctcatgct 

CTCTAAAGCTAAGGrGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGA 
B ' R F " R R 1 L .- K V G <- R W R F P G R R L Q Q p p A R G s H A 

GGAGrTCTTCGCCCACCCCAACTTGTTTAT TGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTG 

cctcaagaagcggoTggggttgaacaaataacgtcgaatattaccaatgIttatttcgttatcgtagtgIttaaagtgtttatttcgtaaaaaaagtga- Se '" ,: 

G V L R P p 0 L v y c S L . V L OIKQ.HHKFHK.S!FFT 

"" I. I II--. 

CATTCrAGTTGTGGTTrGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGG rCArAGCTGT 
GrAAGATCAACACCAAACAGGTTTGAGTAGTTACATAGAATAGTACAGACATATGGCAGCTGGAGATCGATCTCGAACCGCArTAGTACCAGTATCGArA ^ 
A ^ •*- v ^V QTHQC tLSCLYTVDL ■ LELGV I M V t A V 

TTCCTGTGTGAAArTGrTATCCGCTCA CAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAG TGAGCTAACTrAC 
AAGGACACACTTTAACAATAGGCGAGTGTTAAGGrGTGTTGTATGC TCGGCCTTCGTATTTCACATTTCGGACCCCACGGATTAC TCACTCGATTGAGTG 
S C V K L L S A H N S T 0 H T S R K H K V . S L G C L H S £ L T H 

ATTAATTGCGTTGC GCrCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGT TTGCGT 

taattaacgcaacgcgagtgacgggcgaaaggtcagcccIttggacagcacggtcgacgtaattacttagccggttgcgcgcccctctccgccaaacgca 
ncv altarf pvgkpvvpaalmn rptrge r r f a 

ATTGGGCGCTCTTCCGCTTCCTCGCTCACTG ACTCGCTGCGCrCGGTCGTTCGGCTGCGGCGAGCGGTATC AGCTCACTCAAAGGCGGT AAYACGGTTAT 
fAACCCGCGAGAAGGCGAAGGAGCGAGTGACTGAGCGACGCGAGCCAGCAAGCCGACGCCGCTCGCCATAGTCGAGTGAGTTTCCGCCATrATGCC AATA ^ 
* * L " " " 1 A " . ■ L A * »- G » S A A A S G ISSLKGGMrvl 






CCACAGAATCAGGGGATAACGCAGG AAAGAACATGTG AGCA AAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGflrrfl rr:rT/:r- 

, R K £ HVSKRPAKGQEP ' ' ~ 



K GRVAGVTP 



rcCCTCGTGCGCTC TCCTGTTCC GACCC TGCCGC 7 



A K P 0 R T | 



< I P G V S 



P V K 



• CTrACCGG4rACCTGTCCGCCTTTCT ^ CT ^GGGAAGCGTGGCGCTT TCTCAAT 6 CTCArnrrr. TJ 

GAGGACAAGGCTGGGACGGCGAArGGCCTArGGACAGGCGGAAAGAGGGAAGCCCTTCGCACC GCGAAAGAGTTACGAGTGCGACAT 
1 ' L S C S ° P * * I » ! ,P V » L S P r G K R G A P S M L T L 



GGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGG CTGTGTGCACGAACCCCCC6TTCAGCCCGArr nrTnr/:r/-T . 1, T , , 

CCA T G ; CA r CAC :T CCAGCA _ „ 

■ ■ G " G " 3 V ° A G L C A » T ■» • S A R p L R L , „ L s s 



TGAGrcCAACCCGGTAAGACACGACTTATCGCCACr G GCAGC AGCCACTG G rAACAGGAr TAGCAGAGC G AG GT A T , T ,.^^ T ^ 

^^^^^^^^^^^^^^^ ^^ ^^^^fC r CGCTCCATACATCCGCCACGATGTCTCAAGAAi: 

1 * tf 5 E VCRRC YRVL 



950C 



AAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGrAT T T GGTATC T GCGCrCTGCTGAAGCCAGTTA rrTTf -.. rTrTTf , 

'■ ■ • " 0 s 1 v Y L . R s * e * s r L R K ,< s v , L L , 

CCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGT TTGCAAGCAGCAGATTACGCGCAGAAAA AAARitATr Tr a 

' F C L ° A A ° Y * ° * " * 'SRRSFOLF 



" 0 T N H R w 

r«. 8M ,c, g ,«c, I «ro t « < coA.^, I .«„ .« 66a ., rT „„ c . I8>e „ I „ c <>> ^ | , ji „„ ji ,„ 

.• H5V£RK lTLR0F G ME| IKKOLHL 

„ , T T ' """'""• II,t " T "' : "'' i °'"'""""""""°"»«»»co„. w>a .,.,.; 

,NLKyi v NLV.QLPMLNQ 



IOCK 



; rr c t ^ 

CMAGrAGoTATCAACGGACTGAGGGGCAGCACATCTATTGATGCTATGCCC ^^CGAATGGTAGACCGGGGTCACGACGTTACTATGGCGCTCTGGGTGC 
' ^^VONYD T GGLT I VPQCCNO TARP T 

CrCACCG.3CTCCAGArTTATCAGCAATAA A CCAGCCAGCC GGA,GGGCCGAGCGCAGAAGTGGTC CrGCAAr TTTATf -^. T 

^^^^^T^^^G^^^S^^^B^^^F^ ^^^*^ ' ^" T AT* TTGG TCGG T C 3GCC TTCCCGGC TCGCGTC T TC AC C AGGACG^TT'oi tt AA T^GGCGG AGG TAGG TC AGA TA A ™« 
' 1 ^^^F I R L H P V Y 

^U.iCGArCTCArTCArCAAGCGGTCAArTArCAAACGCGTTGCAACAACGGrAACGATG rc^-T^rrArr^^. > 

U L P G S s K - A c At-GATG ' C^oTAbC ACCAC AGTGCGAGCAGCAAACC A " 

A S F A 0 R C C " C Y R H p r w t 




' ^SCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTT CGGTCCTCCGATCGTTGT 

accgaagtaagtcgaggccaagggttgctagttccgctcaatgtac tagggggtacaacacgttttttcgccaatcgaggaagccaggaggctagcaaca '°* 

C F I 0 L R F P T I K A S Y M I P H V V Q K S G . L L R S S 0 R C 

CaGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGC AGCACTGCArAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGT GACTGGT 

GTCTTCATTCAACCGGCGTCACAATAGTGAGTACCAATACCGTCGTGACGTATTAAGAGAATGACACTACGGTAGGCATicTACGAAAAGACACTGACCA '°* 
^ * . VGRSV I T HGYG5TA . FS YCHA'I RKMLF CDV 

GAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAG CAGAACTT 
CTCArGAGTTGGTrCAGTAAGACTCTTATCACATACGCCGCTGGCTCAACGAGAACGGGCCGCAGTTATGCCCTATTATGGCGCGGTGTATCGTCTTGAA ''^ 
■ V L » 0 V '. L R ' V V * A T ELLLPGVNTG.YRAT.QNF 

' ' ' "*"' ill I i - ■ 

TAAAAGTGCTCATCATTG6AAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGC ACCCAA 

ArTTKACGAGTAGTAACCTTTTGCAAGAAGCCCCGCTTTTGAGAGTTCCrAGAATGGCGACAACTCTAGGTCAAGCTACATTGGGTGAGCACGiGGGr! '°* 
K S A H H V K T F F G A K T L K D L T A V £ I 0 F 0 V T H S C TO 

CTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCT6GGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAAT GT 

gactagaagtcgtagaaaatgaaagtggtcgcaaagacccactcgtttttgtccttccgttttacggcgttttttccctIattcccgctgtgcctttaca ,07< 

1 LU 1 F Y F H °. R F * v S < N R K A K C R K K G N K G 0 T E M 

TGAATACTCAT ACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCAT6A6CGGATACATATTTGAATGTATTTAGAAAAATAAAC 
ACTTATGAGTATGAGAAGGAAAAAGTTATAATAACTTCGTAAATAGTCCCAATAACAGAGTACTCGCCTATGTATAAAciTACATAAATCTTTTTATTTG '°* 
LNTHTLP FS I LLKHLSGL L S H £ R I H I .MVLEK.T 



AAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTG ACGTC 
TTTATCCCCAAGGCGCGTGTAAAGGGGCTTTTCACGGTGGACTGCAG 
NRGSAHISPKSAT.R 






Sequence of pTB72, an expression vector incorporating C. elegans
UNC-53. The open reading frames (ORFs) of the prolinker + Ce-UNC-53
(upper ORF) and Ce-UNC-53 alone (lower ORF) are listed under the sequence.

r^^r^^^Jf^' ehanoino ori on drdexequenos£reaOM: maandac. 8 M 1006 0927- 

gtcgccacgtcagcaaccgcttcagcaactaacccaaattccaactttccaca^tgtcaaS 

ACTCCACAGTCAAGAATATCGAAAA ^^ 

? X5 A ^.T9A^ J C J^^.?.3^ rCA TC AAATAA TAG AAATT C ATT CC G TCC G TC GA GCCGT^C QAQ TGGC AATAATAAT 

GJTGGCTCGACGATATCCACATCTGCGAAGAGCmGAATCATCATC 

CG ^ CC I ACCTCCCAACTCCAAAAACC ^ CTAG ^^ 

CGAA ;?f/ CA ^ GCTAGCCGCTCCG ^^ 



"*'*'"*''"""*■' " " "*" TCGAAAATC 

CGAAGCTTT 
GCTCAAAG/ 
iGGTTCCCT 



1GAAGCTTT 
5CTCAAAGJ 



G Z A 5 G - C . CAA ^ G . TTTCCTACCGTAAAACGGACG C<^ 



AAACAAGC 
5GCTCATG 
kAGAATAAC 
5CTGAAAG 
GTCGTCT 

gg c t ^i; a ^ 



CTCATG 
oAATAAC 
TGAAAG 



£aLaa!cc C I£^ 



G^CTCACTAlcCAt^ 
rA C J C ^? TAA ? CAAGAAACAGGAGAACTATGGAGCA ^ G TTTOATCTTT^ 



TG^G^^A^ 



AGA C ct?GGCTG^^ 



CO I C £ G J^ ATr ™^ 

rTGTTAATATCAGCATTCCTGAAAA 
•AAGCAAAGAATCATGCATCGTAAT 



aZc^gtacagatgc^tcagagctcttc^ 

ATTGAGAAAAC^nC^G^A 0 ^^^^ 



WO 98/24810 128/270 PCT/EP97/06956 



TCCCCTGAATGGTTCATTCGATTGTGGAATGAGAACTTCATTCCATATTTGGAACGTGTTGCTAGAGATGGCAAA 
A AAACCTTCGG TCGCTGC ACTTCCTTC GAGGATCCC AC CGACATCGTCTC TAAAAAATGGCCGTGGTTCGATGG 
TGAAAACCCGGAGAATGTGCTCAAACGTCTTCAACTCCAAGACCTCGTCCCGTCACCTGCCAACTCATCCCGAC 
AACACTTCAATCCCCTCGAGTCGTTGATCCAATTGCATGCTACCAAGCATCAGACCATCGACAACATTTGAACAG 
5 AAGACTCTAATCTTCTCTCGCCTCTCCCCCGCTTTCCTTATCTTCGTACCGGTACCTGATGATTCCCCATTTTCC 
CCCTTrTCCCCCCAATTTCCCAGAACCTCGTGTTCCGTTTGTTCCTAGTCCTCCCGGGTGCCGACGCCGAAGCG 
ATTTAAAAACCTTTTTCTTTCCGAAACATTTCCCATTGCTCATTAATAGTCAAATTGAATAAACAG 
AAAAAAAAAAAAAAAAAAACTCGAGGGGGGGCCCTATTCTATAGTGTCACCTAAATGCTAGAGCTCGCTGATCAG 
m CCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGT 
10 GC CAC TCCCACTGTCC TTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTC ATTCTATTCTG 
GGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGT 
GGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGG 
CGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGC 
TCCTTrCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGCATCC 
1 5 CTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGT 
GGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTT 
CCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTrTGGGGATTTC 
TTGG7TAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGT 
GGAAAGTCCCCAGGCTCCCCAGGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGT 
20 GGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCC 
GCCCC TAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTT 
mi iATTTA TGCA GAGGCCG^GGCCGCCTCTGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGA 
GGCCTAGGCTTTTGCAAAAAGCTCCCGGGAGCTTGTATATCCATTTTCGGATCTGATCAAGAGACAGGATGAGG 
ATCG7TTCGCATGATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGC 
25 TATG ACTGG GCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCG 
GTTCTT7TTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGC 
TGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATT 
GGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGAT 
GCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGC 
30 GAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGC 
CAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCAT GCCC GACGGCGAGGATCTCGTCGTGACCCATGGCGATG 
CCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGC 
GGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGC 
TTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTT 
35 CTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCA 
CCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGG 
GGATCTCATGCTGGAGTTCTTCGC CCACC CCAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAG 
CATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGT^ 

ATCATGTCTGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAArr 
4U GTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTG 
AGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTCCCAGCTGCATTA 
ATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCG 
CTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAAT 
CAGGGGA TAACG CAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTT 
45 GCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGA 
AACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCT 
GCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGG 
TATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCT 
GCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACT 
5U GGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCT 
ACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT 
TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGG I I I I 1 1 I GTTTGCAAGCAGCAGATTACGCGCAGAAAAAA 
AGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGA 
TTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTT7AAATTAAAAATGAAGTTTTAAATCAATCTAA 
AGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTA 
TTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCC 
CAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAA 
GGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGA 
GTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCA7TGCTACAGGCATCGTGGTGTCACGCTCGTC 
OU GTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGT7GTGCAAAA 
AAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATG 
GCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAG 
TCArTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCCCCACA 
TAGCAGAACTTTAAAAGTGCTCATCAT7GGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTT 
03 GAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGG 
GTGAG CAAAA ACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATAC 
TCT7CCTTTT7CAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAG 
AAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCT 
CCCGATCCCCTATGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGC 
/U T7GTGTGT7GGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATT 
GCATGAAGAATCTGCTTAGGGT7AGGCGTT77GCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACAT 
TGAtTArTGACTAGTTATTAATAGTAATCAArrACGGGGTCATTAGrrCATAGCCCATATATGGAGTTCCGCGTTA 
CATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATCACGTAT 
GTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGGTAAACTGCCCACrr 
O GGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTA 
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*ZZ%£l£\£ x Asn J al Glu Leu 116 Pro Iie 

Lys 1 s a er S ?lJ r l*T ^ ^ ^ ^ Ser 

vaf SJ«2 a LJ" n Asp 45 Phe Arg Asp Tyr Arg Leu 

Pro^^rkJ 16 Val 60 Pr ° ASn G1U Phe S " 

Gly y Lu r c 9 lu Le ?h Ala Ly %^ le Thr S " AS " Leu A,p 
Cys'ier^yf E.?' AS ° LeU G1 * A.p 

Ala Th Jai y Lu h G r i ASP "to?* AS " LeU G1 ^ 

Leu Le U e c U ln h L e eu LeU Thr T ** LyS Gln "*» 

Thr y S 5 er y ::-rSe Gln Ly ? 3 £ y- L6U G1 " G1 " Leu P ™ 

v a r r Aia pr ? h Ai s a e r ai se [ 50 Lys Leu pr ° ser p - ^ 

Ph A1 ? r o h G r i Ala e f r Al Ul hC Pr ° AS " Ser A - 

Ile^rU^lir 9 ^ ^ S " ^ 

SerZly'LTiy? 5 lle l9 t lY ^ Pr ° LyS Thr 

T h rLn Pr s e: : e ; h r r se 2 r 10 Thr ^ s ~ s ^ *.» 
va Ar G 9 i Pr s o er T r h r r Ar ? 2 f er ser Giy AS " 

SeJIh^yrSer" S " ^ ^ Ser S " 
Ser Ile Ser Asn Leu Asn Arg Pro Thr Ser Gln
Leu Gln Lys Pro 255

Ser Arg Pro Gln Thr Gln Leu Val Arg Val Ala
Thr Thr Thr Lys 270

Ile Gly Ser Ser Lys Leu Ala Ala Pro Lys Ala
Val Ser Thr Pro 285

Lys Leu Ala Ser Val Lys Thr Ile Gly Ala Lys
Gln Glu Pro Asp 300

Asn Ser Gly Gly Gly Gly Gly Gly Met Leu Lys
Leu Lys Leu Phe 315

Ser Ser Lys Asn Pro Ser Ser Ser Ser Asn Ser
Pro Gln Pro Thr 330

Arg Lys Ala Ala Ala Val Pro Gln Gln Gln Thr
Leu Ser Lys Ile 345

Ala Ala Pro Val Lys Ser Gly Leu Lys Pro Pro
Thr Ser Lys Leu 360

Gly Ser Ala Thr Ser Met Ser Lys Leu Cys Thr
Pro Lys Val Ser 375

Tyr Arg Lys Thr Asp Ala Pro Ile Ile Ser Gln
Gln Asp Ser Lys 390

Arg Cys Ser Lys Ser Ser Glu Glu Glu Ser Gly
Tyr Ala Gly Phe 405

Asn Ser Thr Ser Pro Thr Ser Ser Ser Thr Glu
Gly Ser Leu Ser 420

Met His Ser Thr Ser Ser Lys Ser Ser Thr Ser
Asp Glu Lys Ser 435

Pro Ser Ser Asp Asp Leu Thr Leu Asn Ala Ser
Ile Val Thr Ala 450

Ile Arg Gln Pro Ile Ala Ala Thr Pro Val Ser
Pro Asn Ile Ile 465

Asn Lys Pro Val Glu Glu Lys Pro Thr Leu Ala
Val Lys Gly Val 480

Lys Ser Thr Ala Lys Lys Asp Pro Pro Pro Ala
Val Pro Pro Arg 495

Asp Thr Gln Pro Thr Ile Gly Val Val Ser Pro
Ile Met Ala His 510
Lys Lys Leu Thr Asn Asp Pro Val Ile Ser Glu
Lys Pro Glu Pro 525

Glu Lys Leu Gln Ser Met Ser Ile Asp Thr Thr
Asp Val Pro Pro 540

v. ^ e iL Pro Pro Leu Lys Ser Val Val Leu Lys
Met Thr Ser Ile 555

Arg Gln Pro Pro Thr Tyr Asp Val Leu Leu Lys
Gln Gly Lys Ile 570

Thr Ser Pro Val Lys Ser Phe Gly Tyr Glu Gln
Ser Ser Ala Ser 585

w ? 1 ^ Asp Ser Ile Val Ala His Ala Ser Ala Gln
Val Thr Pro Pro 600

Thr Lys Thr Ser Gly Asn His Ser Leu Glu Arg
Arg Met Gly Lys 615

Asn Lys Thr Ser Glu Ser Ser Gly Tyr Thr Ser
Asp Ala Gly Val 630

Ala Met Cys Ala Lys Met Arg Glu Lys Leu Lys
Glu Tyr Asp Asp 645

Met Thr Arg Arg Ala Gln Asn Gly Tyr Pro Asp
Asn Phe Glu Asp 660

Ser Ser Ser Leu Ser Ser Gly Ile Ser Asp Asn
Asn Glu Leu Asp 675

Asp Ile Ser Thr Asp Asp Leu Ser Gly Val Asp
Met Ala Thr Val 690

Ala Ser Lys His Ser Asp Tyr Ser His Phe Val
Arg His Pro Thr 705

Ser Ser Ser Ser Lys Pro Arg Val Pro Ser Arg
Ser Ser Thr Ser 720

w Y a i Asp Ser Arg Ser Arg Ala Glu Gln Glu Asn
Val Tyr Lys Leu 735

Leu Ser Gln Cys Arg Thr Ser Gln Arg Gly Ala
Ala Ala Thr Ser 750

Thr Phe Gly Gln His Ser Leu Arg Ser Pro Gly
Tyr Ser Ser Tyr 765

Ser Pro His Leu Ser Val Ser Ala Asp Lys Asp
Thr Met Ser Met 780
His Ser Gln Thr Ser Arg Arg Pro Ser Ser Gln
Lys Pro Ser Tyr 795

Ser Gly Gln Phe His Ser Leu Asp Arg Lys Cys
His Leu Gln Glu 810

Phe Thr Ser Thr Glu His Arg Met Ala Ala Leu
Leu Ser Pro Arg 825

Arg Val Pro Asn Ser Met Ser Lys Tyr Asp Ser
Ser Gly Ser Tyr 840

Ser Ala Arg Ser Arg Gly Gly Ser Ser Thr Gly
Ile Tyr Gly Glu 855

Thr Phe Gln Leu His Arg Leu Ser Asp Glu Lys
Ser Pro Ala His 870

Ser Ala Lys Ser Glu Met Gly Ser Gln Leu Ser
Leu Ala Ser Thr 885

Thr Ala Tyr Gly Ser Leu Asn Glu Lys Tyr Glu
His Ala Ile Arg 900

Asp Met Ala Arg Asp Leu Glu Cys Tyr Lys Asn
Thr Val Asp Ser 915

Leu Thr Lys Lys Gln Glu Asn Tyr Gly Ala Leu
Phe Asp Leu Phe 930

Glu Gln Lys Leu Arg Lys Leu Thr Gln His Ile
Asp Arg Ser Asn 945

Leu Lys Pro Glu Glu Ala Ile Arg Phe Arg Gln
Asp Ile Ala His 960

Leu Arg Asp Ile Ser Asn His Leu Ala Ser Asn
Ser Ala His Ala 975

Asn Glu Gly Ala Gly Glu Leu Leu Arg Gln Pro
Ser Leu Glu Ser 990

Val Ala Ser His Arg Ser Ser Met Ear Ser Ser
Ser Lys Ser Ser 1005

Lys Gln Glu Lys Ile Ser Leu Ser Ser Phe Gly
Lys Asn Lys Lys 1020

Ser Trp Ile Arg Ser Ser Leu Ser Lys Phe Thr
Lys Lys Lys Asn 1035

Lys Asn Tyr Asp Glu Ala His Met Pro Ser Ile
Gly Thr Leu Asp Asn Ile Asp Val Ile Glu Leu
Lys Gln Glu Leu 1065

Lys Glu Arg Asp Ser Ala Leu Tyr Glu Val Arg
Leu Asp Asn Leu 1080

Asp Arg Ala Arg Glu Val Asp Val Leu Arg Glu
Thr Val Asn Lys 1095

Leu Lys Thr Glu Asn Lys Gln Leu Lys Lys Glu
Val Asp Lys Leu 1110

Thr Asn Gly Pro Ala Thr Arg Ala Ser Ser Arg
Ala Ser Ile Pro 1125

Val Ile Tyr Asp Asp Glu His Val Tyr Asp Ala
Ala Cys Ser Ser 1140

Thr Ser Ala Ser Gln Ser Ser Lys Arg Ser Ser
Gly Cys Asn Ser 1155

Ile Lys Val Thr Val Asn Val Asp Ile Ala Gly
Glu Ile Ser Ser 1170

Ile Val Asn Pro Asp Lys Glu Ile Ile Val Gly
Tyr Leu Ala Met 1185

Ser Thr Ser Gln Ser Cys Trp Lys Asp Ile Asp
Val Ser Ile Leu 1200

Gly Leu Phe Glu Val Tyr Leu Ser Arg Ile Asp
Val Glu His Gln 1215

Leu Gly Ile Asp Ala Arg Asp Ser Ile Leu Gly
Tyr Gln Ile Gly 1230

Glu Leu Arg Arg Val Ile Gly Asp Ser Thr Thr
Met Ile Thr Ser 1245

His Pro Thr Asp Ile Leu Thr Ser ^e" T*- Thr
Ile Arg Met Phe 1260

Met His Gly Ala Ala Gln Ser Arg Val Asp Ser
Leu Val Leu Asp 1275

Met Leu Leu Pro Lys Gln Met Ile Leu Gln Leu
Val Lys Ser Ile 1290

Leu Thr Glu Arg Arg Leu Val Leu Ala Gly Ala
Thr Gly Ile Gly 1305

Lys Ser Lys Leu Ala Lys Thr Leu Ala Ala Tyr
Val Ser Ile Arg 1320
Thr Asn Gln Ser Glu Asp Ser Ile Val Asn Ile
Ser Ile Pro Glu 1335

Asn Asn Lys Glu Glu Leu Leu Gln Val Glu Arg
Arg Leu Glu Lys 1350

Ile Leu Arg Ser Lys Glu Ser Cys Ile Val Ile
Leu Asp Asn Ile 1365

Pro Lys Asn Arg Ile Ala Phe Val Val Ser Val
Phe Ala Asn Val 1380

Pro Leu Gln Asn Asn Glu Gly Pro Phe Val Val
Cys Thr Val Asn 1395

Arg Tyr Gln Ile Pro Glu Leu Gln Ile His His
Asn Phe Lys Met 1410

Ser Val Met Ser Asn Arg Leu Glu Gly Phe Ile
Leu Arg Tyr Leu 1425

Arg Arg Arg Ala Val Glu Asp Glu Tyr Arg Leu
Thr Val Gln Met 1440

Pro Ser Glu Leu Phe Lys Ile Ile Asp Phe Phe
Pro Ile Ala Leu 1455

Gln Ala Val Asn Asn Phe Ile Glu Lys Thr Asn
Ser Val Asp Val 1470

Thr Val Gly Pro Arg Ala Cys Leu Asn Cys Pro
Leu Thr Val Asp 1485

Gly Ser Arg Glu Trp Phe Ile Arg Leu Trp Asn
Glu Asn Phe Ile 1500

Pro Tyr Leu Glu Arg Val Ala Arg Asp Gly Lys
Lys Thr Phe Gly 1515

Arg Cys Thr Ser Phe Glu Asp Pro Thr Asp Ile
Val Ser Lys Lys 1530

Trp Pro Trp Phe Asp Gly Glu Asn Pro Glu Asn
Val Leu Lys Arg 1545

Leu Gln Leu Gln Asp Leu Val Pro Ser Pro Ala
Asn Ser Ser Arg 1560

Gln His Phe Asn Pro Leu Glu Ser Leu Ile Gln
Leu His Ala Thr 1575

Lys His Gln Thr Ile Asp Asn Ile
tblastn search of the EST division of GenBank with the ORF of the
longest known Ce-UNC-53 cDNA, tb3.MS, reveals two ESTs with homology to a
predicted coiled-coil region in Ce-UNC-53.

TBLASTN 1.4.9MP [26-March-1996] [Build 14:27:13 Apr 1 1996]






Reference: Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers,
and David J. Lipman (1990). Basic local alignment search tool. J. Mol.
Biol. 215:403-10.

Query= tb3 H5 OW
(ISfj letters)

Database: Non-redundant Database of GenBank EST Division
647,253 sequences; 234,216,808 total letters.



gb|H09036|H09036
gb|AA049124|AA049124
gb|R91475|R91475
gb|T23446|T23446
gb|R86390|R86390
gb|T44781|T44781
gb|T75582|T75382

yl96c11.r1 Homo sapiens cDNA clo...
mj46f04.r1 Soares mouse embryo N...
yq08c11.r1 Homo sapiens cDNA clo...
»Qq2955 Homo sapiens cDNA clone...
SW3ICA339SK Brugia malayi infect...
8044 Arabidopsis thaliana cDNA c...
yd63f11.r1 Homo sapiens cDNA clo...



Reading High
Frame Score
358

Smallest
Sum
Probability



177 
115 
106 
59 
61 
74 
71 
64 



>gb|H09036|H09036 yl96c11.r1 Homo sapiens cDNA clone 46037 3'
Length = 489



P(N)
7.9e-54
8.6e-16
1.1e-05
8.6e-05
0.21
0.99
0.996
0.9992
0.99992



Plus Strand HSPs:

Score = 115 (52.1 bits), Expect = 1.1e-05, P = 1.1e-05
Identities = 22/70 (31%), Positives = 45/70 (64%), Frame = +1



Query:  1059 IELKQELKERDSALYEVRLDNLDRAREVDVLRETVNKLKTENKQLKKEVDKLTNGPATRA 1118
             L ♦♦KL* v, A * * D LRE *N****£ **LX •»
Sbjct:     7 M ° tRNELaDKEW ^0rRLEALSSAH0L00LREAHNRMQSE I EKI^VXN0RLKSES0CSC 186

Query:  1119 SSRASIPVIY 1128
              SR S P ++
Sbjct:   187 CSRGSFPSVH 216
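
The percentages in the HSP headers follow directly from the residue counts: 22 identical residues over a 70-residue alignment is reported as 31%. The sketch below reproduces that arithmetic. It assumes, from the values printed in these listings (for example 23/58 shown as 39%), that the program truncates instead of rounding; `blast_percent` is an illustrative name, not a BLAST function.

```python
def blast_percent(matches: int, aligned: int) -> int:
    """Whole-number percentage, rounded down (assumed from the printed values)."""
    return (100 * matches) // aligned


# (matches, alignment length) pairs quoted in the HSP headers of this search.
hsp_counts = [
    ("H09036 identities", 22, 70),
    ("H09036 positives", 45, 70),
    ("AA049124 identities", 23, 58),
]

for label, matches, aligned in hsp_counts:
    print(f"{label}: {matches}/{aligned} ({blast_percent(matches, aligned)}%)")
```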






>gb|AA049124|AA049124 mj46f04.r1 Soares mouse embryo NbME13.5 14.5 Mus
musculus cDNA clone 479167 5'
Length = 337
Plus Strand HSPs:

Score = 106 (48.0 bits), Expect = 8.6e-05, P = 8.6e-05
Identities = 23/58 (39%), Positives = 38/58 (65%), Frame = +3



Query:  1057 DVIELKQELKERDSALYEVRLDNLDRAREVDVLRETVNKLKTENKQLKKEVDKLTNGP 1114
             * tL &u E ** L ♦•RL* L* A **0 LAST/** ** E LK c n.i o
Sbjct:    99 EVSELRSELWEKEMKLTOIRLEALNSAHQLDOLRE^ 272
A search of the GenBank databases with part of the nucleotide-binding
domain of Ce-UNC-53 does not identify statistically significant proteins
except for the C. elegans cosmid containing Ce-unc-53.

TBLASTN 1.4.8MP [20-June-1995] [Build 18:00:05 Aug 29 1995]
Query= sections (240 letters)

>lcl|sections

ILTERRLVLAGATGIGKSKLAKTLAAYVSIRTNQSEDSIVNISIPENNKEELLQVERRLE
KILRSKESCIVILDNIPKNRIAFVVSVFANVPLQNNEGPFVVCTVNRYQIPELQIHHNFK
MSVMSNRLEGFILRYLRRRAVEDEYRLTVQMPSELFKIIDFFPIALQAVNNFIEKTNSVD
VTVGPRACLNCPLTVDGSREWFIRLWNENFIPYLERVARDGKKNLRSLHFLRGSHRHRLX



Database: Non-redundant PDB+GBupdate+GenBank+EMBLupdate+EMBL
520,383 sequences; 367,017,413 total letters.



Sequences producing High-scoring Segment Pairs:

emb|Z47810|CEF45E10
gb|R41071|R41071
gb|T44781|T44781
emb|Z48334|CEF10B5
gb|M81884|EPFCPCC
gb|L09547|PEAPCLP
gb|H32604|TOMCD4B
emb|X69188|APTUSGA
gb|T44782|T44782
gb|M17087|HUMRASX:;
emb|X57702|GGNATRI'-'P
gb|X01520|HUMRASX31

Reading High
Frame Score

Caenorhabditis elegans cosmid F45... -2 1131
Hk575-f Homo sapiens cDNA clone k... +2 53
8044 Arabidopsis thaliana cDNA cl... +1 74
Caenorhabditis elegans cosmid F10... O 71
Epifagus virginiana chloroplast c... +1 49
Pisum sativum (clone pCLp) nuclea... +1 71
Tomato ATP-dependent protease (CD... +1 71
A.phyllitidis mRNA for gamma-tubulin +2 56
8045 Arabidopsis thaliana cDNA cl... «-l 68
Human c-ras-Ki-2 activated oncoge... +2 58
G.gallus RNA for precursor of nat... O 56
Human lung adenocarcinoma (PR371)... *-2 57



>gb|R41071|R41071 Hk575-f Homo sapiens cDNA clone k575-f.
Length = 310
Plus Strand HSPs:

Score = 53 (24.5 bits), Expect = 0.40, Sum P(2) = 0.33
Identities = 9/15 (60%), Positives = 13/15 (86%), Frame = +2

Query:   130 GFILRYLRRRAVEDE 144
             GF++RYLRR+ VE +
Sbjct:    26 GFLVRYLRRKLVESD 70

Score = 47 (21.7 bits), Expect = 0.40, Sum P(2) = 0.33
Identities = 9/26 (34%), Positives = 17/26 (65%), Frame = +3

Query:   170 NNFIEKTNSVDVTVGPRACLNCPLTV 195
             + F+EK +++D  +GP   L+ PL +
Sbjct:   147 HTFLEKHSTLDFLIGPCFFLSGPLAL 224
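
In a tblastn report the query coordinates count amino acids while the subject coordinates count nucleotides, so each aligned residue spans three subject positions and the reading frame follows from the subject start. The sketch below checks this for the two R41071 segments; the function names are illustrative, and the frame convention (frame 1 begins at nucleotide 1) is assumed.

```python
def plus_strand_frame(sbjct_start: int) -> int:
    """Reading frame (1, 2 or 3) of a plus-strand HSP from its 1-based nucleotide start."""
    return (sbjct_start - 1) % 3 + 1


def codons_spanned(sbjct_start: int, sbjct_end: int) -> int:
    """Number of codons covered by an inclusive nucleotide interval."""
    return (sbjct_end - sbjct_start + 1) // 3


# (query start, query end, subject start, subject end) for the two segments
segments = [(130, 144, 26, 70), (170, 195, 147, 224)]

for q_start, q_end, s_start, s_end in segments:
    residues = q_end - q_start + 1
    print(
        f"query {q_start}-{q_end}: {residues} residues, "
        f"subject {s_start}-{s_end}: {codons_spanned(s_start, s_end)} codons, "
        f"frame +{plus_strand_frame(s_start)}"
    )
```

The two segments fall in different frames of the EST, which is what would be expected if the EST carries a frameshifting sequencing error between them.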



Smallest
Sum
Probability
P(N)
5.1e-158
0.33
0.35
0.83
0.91
0.99
0.99
0.992
0.9995
0.9998
0.9999
0.99995
Three-frame translation of EST gb:R41071.

Regions of homology with Ce-UNC-53 in two different frames are underlined.
The spacing between the blocks of homology is similar to that in Ce-UNC-53.
Subsequent re-cloning and re-sequencing of this region in man identified multiple
sequencing errors in gb:R41071 and an ORF which is more homologous to,
and co-linear with, Ce-UNC-53 (see alignment in Fig. 12).
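
A three-frame translation reads the same strand three times, starting at the first, second and third nucleotide. The following is a minimal sketch of that procedure, assuming the standard genetic code; `three_frame_translation` is an illustrative helper and not the tool used to produce the figure. It is applied to the first 60 nucleotides of the EST as printed below the caption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def three_frame_translation(dna: str) -> list[str]:
    """Translate the given strand in frames 1, 2 and 3 (complete codons only)."""
    dna = dna.upper()
    return [
        "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))
        for offset in range(3)
    ]


# First 60 nt of EST R41071 as listed in the figure.
est_start = "CTCCAACAACGTGGAGCCAGCCAATGGCTTCCTGGTTCGTTACCTGAGGAGGAAGCTGGT"
frames = three_frame_translation(est_start)

for number, protein in enumerate(frames, start=1):
    print(f"frame {number}: {protein}")
```

Frame 2 contains the stretch GFLVRYLRRKL that aligns with Ce-UNC-53 in the tblastn report, while frame 3 already hits a stop codon within this window.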



CTCCAACAACGTGGAGCCAGCCAATGGCTTCCTGGTTCGTTACCTGAGGAGGAAGCTGGT
        10        20        30        40        50        60
L  Q  Q  R  G  A  S  Q  W  L  P  G  S  L  P  E  E  E  A  G
 S  N  N  V  E  P  A  N  G  F  L  V  R  Y  L  R  R  K  L  V
  P  T  T  W  S  Q  P  M  A  S  W  F  V  T  *  G  G  S  W  *

AGAGTCAGACAGCGACATCAATGCCAACAAGGAAGAGCTGCTTCGGGGTGCTCGACTTGG
        70        80        90       100       110       120
R  V  R  Q  R  H  Q  C  Q  Q  G  R  A  A  S  G  C  S  T  W
 E  S  D  S  D  I  N  A  N  K  E  E  L  L  R  G  A  R  L  G
  S  Q  T  A  T  S  M  P  T  R  K  S  C  F  G  V  L  D  L  G

GTACCCAAGCCTGTGGTATCATCTTCCACACCTTCCTTGAGAAGCACAGCACCTTAGACT
       130       140       150       160       170       180
V  P  K  P  V  V  S  S  S  T  P  S  L  R  S  T  A  P  *  T
 Y  P  S  L  W  Y  H  L  P  H  L  P  *  E  A  Q  H  L  R  L
  T  Q  A  C  G  I  I  F  H  T  F  L  E  K  H  S  T  L  D  F

TTCTCATCGGCCCTTGCTTCTTTCTGTCGGGTCCATTGGCATTGAGGCTTCCGGACCTTG
       190       200       210       220       230       240
F  S  S  A  L  A  S  F  C  R  V  H  W  H  *  G  F  R  T  L
 S  H  R  P  L  L  L  S  V  G  S  I  G  I  E  A  S  G  P  C
  L  I  G  P  C  F  F  L  S  G  P  L  A  L  R  L  P  D  L  V

TTTATTGACCTGTGGACAACTCTATCATTTCCTATCTACAGGAGGAGCCAAGGATTGGAT
       250       260       270       280       290       300
F  I  D  L  W  T  T  L  S  F  P  I  Y  R  R  S  Q  G  L  D
 L  L  T  C  G  Q  L  Y  H  F  L  S  T  G  G  A  K  D  W  I
  Y  *  P  V  D  N  S  I  I  S  Y  L  Q  E  E  P  R  I  G  *

AAAGGTCCAT
       310
K  G  P
 K  V  H
  R  S






BLASTN search of the EST division of GenBank with Hu-unc-53/1 cDNA 3b.






BLASTN 1.4.9MP [26-March-1996] [Build 14:27:07 Apr 1 1996]
Query= Hu-unc-53/1 cDNA 3b

(3256 letters)

Database: Non-redundant Database of GenBank EST Division
647,253 sequences; 234,216,808 total letters.



Sequences producing 
9b|N366S9l NJ6659 
53/1 

golAA04 3 9 97f AA04 399 7 
<?bl AA049124 IAA049124 
<jbt T0S560I T05S6O 
gblN24681 IN24681 
gb| R41071 IR41071 
gb!N891O4IN89104 
?blft41073IR41073 
gb!R!S492|RlS492 
Qbl H09036I H09036 
gb|W91S67fW91S67 
gb|W744O0tW7440O 
gblAA0033l4 IAA0O3314 



High-scoring Segment Pairs: 
yx91b09.rl Scmo sapiens cONA clone 2.. 

iXS8*01. rl Soarea pregnant uterus Nb. . 
«)46J04.rl Soares mouse embryo NbMEl 
ESTC3449 Homo sapiens cONA clone HFB. .* 
?!f:i? 39 * sl Homo "P lM * cONA clone 2.. 
2^I!l f _ Homo »*Pi«n» CONA clone *c575-C 
JJ?if'/ et * A n «* rt » Lambda ZAP Expre.. 
2S15J"i " Omo **P i#n » CONA clone *144-f 
HH4J4-F Homo sapiens cONA clone H434-F 
yl96cll.rl Homo sapiens cONA clone 4. 
MTA.C36.093.A .MTA aduit mouse thymus." 
zd62c-0.rt Soares fetal heart NbHHl 9 . . 
mg56hl0.rl Soares mouse embryo NbMEl . 



High 
Score 
1666 

1316 
1324 
092 
782 
S3S 
451 
555 
416 
436 
317 
243 
141 



Smallest 

Sum 

Prcaaoility LOCUS 

N assignment 
2.1e-130 I hu-UNC- 



8.3e 
9 

S.ie 
9. 9e 
1.5e- 
7.3e« 
1 .Se- 
2.3e- 
9.4e- 
1 . 9e- 
2.Te- 
0. J4 



-129 
-102 
-84 
•75 
•72 
•57 
•3« 
29 
26 
1 7 
09 



hu-UNC-S3/l 
mu-ONC-53/1 
hu-UNC-53/1 
hu-UNC-S3/l 
hu-UNC-53/1 
hu-UNC-S3/l 
hu-UNC-S3/l 
hu-UNC-53/i 
hu-ONC-53/2 
mu-ONC-53/? 
hu-UNC-53/1 






TBLASTN search of the GenBank sequence database with the 961
aa ORF of cDNA 3b of hu-UNC-53/1. hu-UNC-^fo™ . ^ p!lir
with Ce-UNC53 (cosmid F45E10) compared to the rest of the database.



40 



TBLASTN 1.4.9MP [26-March-1996] [Build 14:27:13 Apr 1 1996]

Query= tmpseq 1

(961 letters)

Database: Non-redundant GenBank+EMBL+DDBJ+PDB sequences
261,674 sequences; 371,416,172 total letters.


Sequences producing. High-scoring Segment Pairs: 



Reading High 
Frame Score 



emblZ47810lCEF45E10 
gblH97501 IHUMCLI? 
e.ivD I X6 4 6 J 8 I HSRESTI N 
gblM5 87S2lECOMCRBC 
embl 211562 I SCNUFIC 
emb I X 73 2 9 7 | SCSETR? 4 
embl XS4 002 I XLXINES I N 
gblU42409lODU42409 
gblU10399l YSCHB082 
gb| U20810IATU20810 
gblL07a79lL£lKINLXXE 
gbl LOimi YSCINTANA 
qbl U28372IYSC09476 
gbl H94 3 62 I KUHLAMBBA 
gblMS8337lVACHAGMA 



Caenorhabditis elegans cosmid F45E10

Human cytoplasmic linker protein-

H. sapiens mRNA for restin
E.coli mcrB and mcrC genes, compl...
S.cerevisiae nuf1 gene
S.cerevisiae spacer element
X.laevis mRNA for kinesine
Dictyostelium discoideum myosin h...
Saccharomyces cerevisiae chromoso...
Arabidopsis thaliana cytoskeleton...
Leishmania chagasi kinesin-like p...
Saccharomyces cerevisiae integrin...
Saccharomyces cerevisiae chromoso...
Human laminin B2 (LAMB2) mRNA, part...
Vaccinia virus hemagglutinin gene.



83 
83 
82 
82 
82 
63 
66 
77 
77 
78 
65 
82 
75 
74 



Smallest
Sum
Probability
P(N) N

2.3e-32
0.47
0.47
0.56
0.61
0.74
0.85
0.92
0.93
0.95
0.95
0.997
0.9991
0.9996
0.99995






HUMAN UNC53-1: 3700 bp

CAD27

Source:

Human heart cDNA, nt 1-3241
Human colorectal, nt 1571-3300
Human heart cDNA, nt 1745-3374
Human heart cDNA, nt 1753-3337
Human heart cDNA, m ln83.370n

Human UNC53-2

Figure 8






A B C D E

ORF 1 627 aa (4.0 exp-54 with Ce-Unc53)

pHH14-3

lambda hh 3b

Hu-Unc53/1: 6.1 kb

pCBHO-14 lambda CAD27

pCB212

Figure 8a






HUMAN UNC53-1: 3700 bp

N36659
AA043997

T05560
N24681
R41071

N89104
R41073

Source:

nt 2775-3113
Human pregnant uterus nt 2584-2969
Mouse embryo nt 1067-1407

Human nt 2803-3050

nt 3000-3200
nt 2284-2599
Human fetal heart nt 3042-3247
nt 3131-3247
nt 1579-1669
nt 1771-1937

Human

W91567
W74400

Mouse thymus adult nt 31(4-3235
Human fetal heart nt 3192-3247

Figure 8b
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W^GGMRRGAGCTGRCnGGRGTCG^^ l0O 
*# T .*V ' S P " " ° ° 1 ° 5 N ° * ° " " T 1 P « « C L R 

TRCCRGCTTCflGTCCCRCGRGGfiGflCCRRGGflGflGCCGRCflrTCCCflTflCCRrTOGTGGGCTGCCTGflflrCCGflrGRCCRCTCRGflGCTCCCTTCrcCCC 200 
V0L0S0C£TK C»BM SMT|GGLPCS0O0SClPSP 

a^^ft^ ^ ^CRGTGCRRRGGGCCRRCTTRCCRRCRTRGTGRGrCCCRC TGCGGCCACCRCGCCRRGRATCRCCCGC TCCRRCRGCRTCCC 300 

RRLPnSLSRcGOtTNfVSPTRRTTPRITRSMSIP 
CRCCCRCGRGGCGGCC TTCGRGC TGTRCRGCGGC rCCCRRR TGGGGRGCRCCCfGTCCCTGGCCGRGRGRCCCRRGGGRRTGR TTCCGTCRGGflrCC TTC <»00 
rHERflfCLvSGSOrtGSTLSLRCRPrGniflSGSr 

CGRGflCCCCflCGCRCGRTGTTCftCGGCTCRGTGCrGTCCCTGGCCTCCRGTGCCTCCTCCnCCTflCTCCTCRGCTGRGGRGRGGnTGCflflrCTGflGCRRfl 500 
R 0 P T 0 0 V N GSVlSlRSSQSSrySSRCCffnosgQ 
I homology Mode A 

rCCGGRflGCTTCGTRGGGflRCrGGRflTCflTCCCRGGRflRflflGTGGCCflCCTTGflCGTCrCflGCTTTCTGCCRRTGCTRRTCTGGTGCCTGCTTTTGRGCR 600 
IRKLaaCLC SSOE KVflTL rSQtSRMRlilvaprgO 

homology tfoclr a 

MGCCTGGTGRRTRTGRCflTCCCGCCTGCGRCflCCrGGCRGflGflCGGCCGRGGflGRRGGflCflCTGflGCTGCTGGflTTTGCGflGflRRCCflTRMCTTTCTG 700 
SLVNnTSB '- R "*- "grRe EK 0TELlOLRETI0f L 
homology OjOCfc A 

RRGRRRflRGRRCTCTGRGCCCCflCGCflGTCflTTCRGGGflGCCCTTRRrGCCrCflGflRflCCRCRCCCflflRGflflCTTCGGRTCRRGRGRCRflRflCTCCTCRG 800 
<ICKWSCft O RV «qGfll.NRSCTTPlCC LP IKRQNSS 
aogywocitA ~ ^ 



RTRCCRTCTCflRGCCTCflRCflGCRTCRCTPGCCRrrCCRGCRTCGGCRGCRGCRflGGRTGCTGRTGCGflRflRflGflflGRRRRRRRftGflGrTGGGTCTflTGR 
OS,SSLN SirSHSSlGSStOHORlfK K K K < S V V V t 



900 

v C 

hometogy otock 8 



GC TTCGRAGTTCC TTCRRCRRRGCGTTCRGTRTflRRRARGGGGCCCRftGrCRGCTTCC TCRTRCTCGGRTATRGRGGRGRrTGCTftCRCCCGRCTCTTCR 1000 
L B S S r H < fl f S 1 K K G P r S R S S V S 0 I C C I R r P 0 S S 
homology O4oc> 8 > 

GCCCCC TCR TC CCCCRRRC TRCRGC RT GG 7 TC TRCRGRGRC TGC TTCRCCC TC CR TCRRG TCC TC CRCC TTG TCCTCCG TGGGCRC TGRTGTCACCGRGG 1 100 
RPSSPKLOHGSTC rflSPSI«SSTLSSVGTOVT£ 

GCCCIGCTCRCCCRGCCCCCCRCPCTRGGCTGrTCCRTGCflRRTGnGGRGGRGGRGCCRGflGRRGRflGGRGGTRTCGGRGCTGCGCrCTGRGCTflTGCCfl 1200 
GPRHPAPHTQLfMRNCCECP C K % t V S C L 0 S C I * C 

| homology b»ocir C 

GBBGGflRRTGARGCTTPCRGRCR 'CCGCTTGGRGGCCCTCRRCTCTGCCCRCCRRCTGGflTCRGCTTCGGGRGRCCf TGCRCRRCRTGCRGTTCGRGCTG «300 
KCnKLTOtP> -g w t>WSPHOLOQ L RETnHNno L CV 
M homology cioch C 

GRCCrcCrGRRRGCRGRGRRTGflCCSacrGflRGGTRCCCCCflGGCCCCTCRrCflGGCTCCflCrCCRGGGCRGGTCCCTGGRTCflrCTGCRnRTCTTCCC l«0O 
° LL,t * C * 0 *l*VflPGPSSGSTPG 0 V P G S $ A L S S 
homology clock C ""^ 



■ «*a* homology in humi v» fuyn2 



CPCGCCCCrCCCrRGGCCrGGCfiCrCPCCCPTTCCrTCGGCCCCflGTCTTCCRGflCfiCPGRCCTGTCRCCCflTGGRrGGCRTCPGTRCTTGTGGTCCRRfl »S00 
PRa SlGlRLT*sr GPSLROTOLSPnOGlSTCGPr 

wean homology m humi vs hum2 ■^mmwm^^hmmi^^^^mmm 

GGRGGRRG TGRCCC'CCGGC ?GG TG 6 ! GPG GR • G CCC CCGCRGCRCRTCRTCRRR GGGGP C T TGR RG CRGCR GGRR T TC T T C C TGGGC 7G FRGCR RG G TC .600 
ccvTtBvvv5nPP0H|| KC0LK o o c r r t c c s < v 



homology c:oc» O 



"CrGCHJWSCTTCflC tGGflOGPTGC TGCP 7 G3PGCrGTT TTCCARGTCTrCPPGCPC TP rpf TTC TRRPPTGGPCCCRGCC TC JCCCC TGCGRC TRPGCR i?00 
SCKV 0»'" t5CRvrovr KO v tSicnoppsrLGis 

homology pipp Q 

C IG^GfCCRTCCRTCGCTRC PGCR'CaGCCPCGTGRRRCGRGTGTTGGRTGCRGRGCCCCCCGRGRTGCCTCCT TGCCG TCGRGG?GTCRRTRRCRTRTC »800 
rcs '"GvSls*v« Q y LOOEPPCnPPcoa&vNM is 
• fv>io»ogy Q'- oc> O *~" *^ 






flCTCTCCCrCflfl«CGTCrGAPCGS02PPTCCCTCCRCBCCCTCCTCTTCCRCRCCCTGnTCCCCRBCCCCnTCPTCCRGCflCTflC«TBfl5CCICCTCCTC i*00 
v S L t G I t E < C V OSLvrCTL tP<PfinOnv|SLLL 
| homology pocfc 6 • prad nucteonoe 80 

PRGCRCCGGCGCC TCGTCCTcr:::;c:CCRGCGGCRCGGGCflRGRCC TRCCtgpcCRPTCGCTTGGCCGRGTRCCTGGTGGRGCGCTCTGCCCGTGRGG 2000 
KHPPt V L £ 3 P S G TGK TVL TwRL RCVLVfRSGRC 

nomotogy a ecu E • pico nuclear* 60 

ICRCRGRCGGC«TCG'CaGC^C:":=PCRTGCPCCRCCRGTCTTGCPRGGPTCrGCPP:rGTRTCTTTCCflflCCTRGCCflPCCPGPTRGPCCGGGRRRC 2100 
v T C G Iv ST^snMOOSCK O CQL VLSNCPWQIOBCT 

fromoioqy Poo £ ■ p/eo nucteot-ga 60 

RGGRRTTGGGGPrGTGCCCC TGGTGR7TCTRTTGGRTGRCCTGRGTGRRGCRGGC TCCRTCRGTGRGTTGGTCRflTGGGGCCC TCRCCTGCRRGTPrCRT 2200 
G IG OVPLV iLLDOlStPGSISEtVWGPL T C < v H 

homology aocK £ » pred mjcmatid* BO 

RPRTGTCCC TATRTTR7RGG TPCCRCC RRTCRGCCTG TRRRRRTGRCRCCCRACCRTGGC TTGCACTTGRGC TTCRGGATGTTGRCCTTC TCCRRCRRCG 2300 

* CPv l 'G TT "OPvgnrPNHGiHisrpniTrsNN 

homology bloc* £ . pred nucteotid* BO 

TGGflGCCRGCCRflTGGCTTCCTCGTTCSTTRCCTGflGGRGGRJlGCTGGTWIRGTCRGRCRGCGRCRTCflflTGCCflflCRRGGflflGflGCTGCnCGGCTGCT 2000 
VCPW w S f LVPVl.RWKtveSOSOlNflMKCCL(>flVC 
homology otpc* E - t»cd nucieqtda BO 

CGRCTGGGTflCCCPRGCTGTGGTRTCqTCTCCRCRCCTTCCnGRGRnGCRCRGCRCCTCWflXTTCCrCflTCGGCCCTTGCTTCTTTCTGTCGTGTCCC 2500 
OVVPCLVyHCMTrtEKHSTSOrL tCPCTFLSCP 

homology block £ • pod nucleotide BO 

A T TGGC A TTG A GG RC T TC C G GRC C T 3G T7C R T TGRCC TG TGGRfl C A AC TC T R T CA T TC CC Tfl TC TR C AG GRR GG RG CCR RGGA TGG GR TflflRG G TCC R TG 2600 
tCICOFpryr I0LVNNS1 IPyLQCGRlCOGItVM 

homology oocir E • or eo nucleotide BP 

GRCRGflflflGCTGCTTGGGaGGRCCt?GTGGKRTGGGTCCGGGRCRCRCTTCCCTGGCCRTCflGCCCRRCflRGRCCRflTCflRRGCTG TRCCRCC TGCCCCC 2700 
G0KRPve0OvCvvRQTLPVPSR0Q00S<LVHLPP 

homology piqck E » pred nuctcqida BO 

RCCCRCCGTGGGCCCTCRCRGCarr^-TCflCCTCCCGfiGGflTRGGflCRGTCflflflGRCRGCflCCCCRflGTTCTCrGGRCTCflGRTCCTCTGflTGGCCRTG 2800 

PTVGPHS ! °SPPEOBTVKOSTPSSLOS0Pinpn 
________________________________ homology Bock E • preg nucleotide BO 

CTGCTGRRRCTTCRRGRflGCTGCCRRCTPCRTTGRGTCTCCRGRTCGRGRRRCCflTCCTGGRCCCCRRCCTTCRGGCRRCRCT nRRGGGVTCGGCPRTC 2900 
LLCLOERP NVJCSP0RCTILOPNL0RTL | 

•vomgogy aacfc £ . piad oucfonoe BO ^ * 

PCTGrCRCCCCCGGRCRGCRGRRCGC7 3GCRTCRGCTRTCTTRGCTCCTCCTCTCCCCTCTCCTCTTTCflGRGCRC TGGCTC TCCflGCCCCRGGflGCRGR 3000 

3' urtqansmao trailer 

RCRGGRGGGR GGPGGRGPTGRPPGAGG^GGGRCRGGrrcrrGGTGCTGTRCCTTTGRGRRCTTCCTHGGRRGGRRTGGTGGGG TGGCG TT TGGGfiflC TTG 3100 

3' untranslated trailer 

TGCCCCCTflRR CRCRTTTPCTGCC:-:::: TRRTGRC TTTGCGGRRRRGRTGRTTCTCGGTC T7TCCCTTGRCTTC TTG TTTCRATTRCRRPC TCCTGCC 3200 

y untranslated trailer 

CTTTCTGGGGR GGGGTTCRGPPPOC = '.— RHRCRCTGCflGCRGTTCCTflflRrGPTTCTCPCPRGCRRCCCTGPGRGRGRCPGTCTTGTGRGGGRGRTCTG 3300 
________ _ __ _ ____________ 3' untranslated trader 

GCGCRGGCRGG RRGCTCC TCRGR TT 7"! TCRCRGRCCCT TCCCRRTTCCRTCfiCCRCTGCCRRCRflC TCCTCCCCCRGRGPTC TGG C T GGEGC CC RG RRR 3<»C0 

3' untranslated tracer 

RPGPPGCRrG TGGTTTPPBflRPTS"-=^PTCPPTCTG7RflPRGGTflRRRRTGRRHRRRCRflPRRCRRGCflRRCRRRCPPRPRPCflRTGGPflPPGRTGPa 3500 
___________ 3' untranslated natter 

GC TGGRGRGRGPSGRPCCPG rTC::-= SS'SGRGRGC TGCCCGC TCCTGCCCTCTGGRTGPCRTRGGGGRCP ICRRCRRGRCGGC TGCCPPCC TGRGRRG 3S00 

3' unoansuieq tuitcr 

TCRC;PRPCC PCPRPPRTPPCC"3C = :::rTCRGGGRRRGRCTRCCRGCTCTGTCTTTC TRCCC TC TRR T f TPRC HP TGCRCCGG RR TFCRGC * TGGRC 3700 

3' untranslated trader 

fTPPCC 37C5 
Tuesday, 18 November 1997 10:33

Hu-Unc53/1 seq (1 > 6013) Site and Sequence
Enzymes: 60 of 146 enzymes (Filtered)

Settings: Linear, Certain Sites Only, Standard Genetic Code
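
A site-and-sequence map of this kind is produced by scanning the linear sequence for restriction enzyme recognition sequences. The sketch below illustrates the idea on the first 22 nucleotides of the listed sequence. The three enzymes are chosen for illustration only, since the 60 enzymes actually used are not named in the figure, and `find_sites` is a hypothetical helper.

```python
import re

# Recognition sequences of three common enzymes (all palindromic, so one
# strand is enough to scan). Illustrative subset, not the figure's enzyme set.
SITES = {"EcoRI": "GAATTC", "EcoRV": "GATATC", "PstI": "CTGCAG"}


def find_sites(dna: str, sites: dict[str, str]) -> dict[str, list[int]]:
    """Return 1-based start positions of each recognition sequence in a linear sequence."""
    dna = dna.upper()
    return {
        name: [m.start() + 1 for m in re.finditer(f"(?={motif})", dna)]
        for name, motif in sites.items()
    }


# First 22 nt of the listed Hu-Unc53/1 sequence (vector linker region).
linker = "GATATCTGCAGAATTCGGCTTC"
positions = find_sites(linker, SITES)

for enzyme, starts in positions.items():
    print(enzyme, starts)
```

The lookahead pattern `(?=...)` lets overlapping sites be reported, which matters here because the PstI site begins on the last base of the EcoRV site.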






GATATCTGCAGAATTCGGCTTCTTTGAGCAAGTTCAGCCTGGTTAAGTCCAAGCTGAATrCCGGGGAAAGCCGAGCCGGATCCC TCGACGACCC T ATGC3 

CTATAGACGTCTTAA GCCGAAGAAACTCGTTCAAGTCGGACCAATTCAGGTTCGACTTAAGGCCCCTTTCGGCTCGGCCTAGGGAGCTCCTGGGATACG- 

> X ==3* ■ 



pCR2.1 linker

lambda gt10 primer EcoRI

I S A E F G F F E Q V Q P G

suspect sequence linker?

pHH14-3



V Q A E F RGKPSR IPRGPYA 



GAGGTCAAGCCGCTCAGCAAGGCGCC1 



ITGAAGCGGCCGTGAGCGAAG ATGGCAAATCGGACGACGAGCTGCrCTCCAGCAAGGCCAAGGCGC AAAAGAGCT 

ctccagttcggcgagtcgttccgcggacttcgccggcactcgcttctaccgtttagcctgctgctcgacgagaggtcgtIccgg 200 



•PHH14-3 



£ V K P L S K A P E A A V S EDGKSDOELLSSKAK 



A Q K S 



CTGGGCCTGTCCCCTC 



TGCCAAGGGCCAGGAGGAGCGCGCCTTCCTCAAGGrGGACCCCGAGCTGGTGGTG 



GACCCGGACAGGGGAGACGGTTCCCGGTCCTCCTCGCGCGGAAGGAGTTCCACCT 



ACCGTGCTGGGAGACCTGGAGCAGCTGCT 



GGGGCTCGACCACCACTGGCACGACCCTCTGGACCTCGTCGACGA 



200 



SGPVPSAK G Q E E R 



-PHH14-3 



* F , L K V D P ELVVTVLGOLEQLL 



:ttcagccagatgctggacccagagtcccagagaaagaggacagtgcagaatgtcctgg 



atctccggcagaacctggaagagaccatgtccagcctgcga 



gaagtcggtctacgacctgggtctcagggtctctttc ^cctgtcacgtcttacaggacctagaggccgtcttggacct tctctggtacaggtcggacgct ^ 

pHH14-3

ORF (1-579 bp) = pLM7 ORF

full available ORF HU-Unc53/1 = pLM1 OR



F S Q M L 0 P E S Q R K R T y q N v L p L R q N L g £ T M S S L R 
GGGTCCCAGGTGAC TCACAGCTCCCTGGAGATGACCTGCfACGACAGCGATGATGCCAACCCACGCAGC CC TC 

cccagggtccactgagtgtcgagggacctctactggacgatgctgtcgcIactacggttgggtgcgtcgcacaggtcggagaggtt^^^ 



pHH14-3

ORF (1-579 bp) = pLM7 ORF

full available ORF HU-Unc53/1 = pLM1 OR

S 0 V T H S S L E n T C Y DSOOANPRSVSSLSNR 



3 S P 



tgtcatggcgctatggccagtccagtccgcggctgcaggctggtgacgcg ccctctgtgggtgggagc 
acagtaccgcgataccggtcaggtcaggcgccgacgtccgaccactgcgc 



TGCCGCTCGGAGCGGACGCCCGCCTGGTACAr 



GGGAGACACCCACCCTCGACGGCGAGCCTCCCCTGCGGGCGGACCATGTA 



-pHH14-3



-ORF (1-579bp) = pLM7 ORF



-full available ORF HU-Unc53/1 = pLM1 OR

LSVRVGQSSPRtQA G 0 A P S V G G S C R 5 E G T P A V Y M 

GCACGGCGAACGGGCCCACTACTCC CACACCATGCCCATGCGCAGCCCCAGCAAGCTCAGCCATATCTCCCGCCTGGAGC TGG TCGAATCCC TGGA3TCG 

11 I ' I ■ I i I ■ I i | i i i | 7,- V "\ 

CGTGCCGCTTGCCCGGGTGATGAGGGTGTGGTACGGGTACGCGTCGGGGTCGTTCGAGTCGGTATAGAGGGCGGACCTCGACCAGCTTAGGGACCTGAGw ' * 



-pHH14-3



-ORF (1-579bp) = pLM7 ORF



-full available ORF HU-Unc53/1 = pLM1 OR



H G E R A H Y S H T M P M R S P S K L SH ISRLELVESL03 
GATGAGGTGGACCTCAAGTCCGGCTACATGAGCGACAGTGACCTCATGGGCAAGACCATGACGGAGGATGATGACATCACTACCGGCTGGGATGAAAGCA 

1 ■ ' 1 1 ' ' 1 i ' i ' » ■ i ■ i > i r ec,' 

CTACTCCACCTGGAGTTCAGGCCGATGTACTCGCTGTCACTGGAGTACCCGTTCTGGTACTGCCTCCTACTACTGTAGTGATGGCCGACCCTACTTTCG7 



-pHH14-3



-ORF (1-579bp) = pLM7 ORF



-full available ORF HU-Unc53/1 = pLM1 OR



OEV QIKSG YM SD SOLMGKTMTEO00 i TTGVOES 

GCTCCATCAGTAGTGGACTCAGCGATGCCTCAGACAATC TCAGTTCAGAAGAATTCAATGCCAGCTCCTCACTCAACTCCCTCCC AAGTACTCCCACTGC 

' 1 ' 1 1 1 1 ' 1 I — — 1 I 1 ' ' t ' ■ t ■ I i ; 

CGAGGTAGTCATCACCTGAGTCGCTACGGAGTCTGTTAGAGTCAAGTCTTCTTAAGTTACGGTCGAGGAGTGAGTTGAGGGAGGGTTCATGAGGGTGACG ~ ' 



-pHH14-3



-pCB212



-ORF (1-579bp) = pLM7 ORF



-full available ORF HU-Unc53/1 = pLM1 OR



3 S I S S G L S D A S 0 NL SSEEFNASSSLN SLPSTPTA 

TTCTCGCAGGAACTCAACAATAGTGCTACGCACAGACTCAG AGAAGCGCTCACTGGCAGAAAGTGGGCTGAGCTGGTTTAGTGAATCAGAGGAGAAAGCC 
AAoAGCGTCCTTGAGTTGTTATCACGATGCGTGTCTGAG TCTCTTCGCGAG rGACCGTCTTTCACCCGACTCGACCAAATCAC rTAGTCTCCTC 7TTCG2 



-pHH14-3



-pCB212



-full available ORF HU-Unc53/1 = pLM1 OR



■5 R R N S T IVLRTOSEKRSLAESGLSVFSE 



SEEK 



CCTAAAAAACTGGAGTACGACAG TGGTAGCCTGAAGATGGAACCTGGGACfTCTAAGTGGCGGAG GGAGCGGCCTGAGAGCTGTGATGATTC ATCC AAGG 
GGATTrTTTGACCTCATGCrGTCACCArCGGACTTCTACCTTGGACCCTGAAGATTCACCGCCTCCCTCGCCGGACTCTCGACACrACTAAGTAGGTTCC 

-pHH14-3

-pCB212



-full available ORF HU-Unc53/1 = pLM1 OR

?KK LEYQS G5LKMEP3TSKVRRERPE3C003S* 



G.GGAGAACTGAAAAAGCCCATCAGCCTGGGCCACCCTGGTTCCCTGAAGAAGGGCAAGACCCCACCTGTGGCTGTAACTTCCCCCATCACTCACACA 



CACCTCTTGACTTTTTCGGGTAGTCGGACCCGGTGGGACCAAGGGACrTCTTCCCGTTCTGGCGTGGACACCGACATTGAAGGGGGTAGTGAGTGTGTCS " 



-pHH14-3 



-pCB212



-full available ORF HU-Unc53/1 = pLM1 OR

GGELKKP l SLGHPGSLKKGKTPP 



VAVTSP ITHTA 



CCAGAGTGCCCTCAAAGTCGCAGGCAAACCTGAGGGCAAAGCTACAGACAAGG GTAAGCrTGCAGTGAAGAATACTGGGCTCCAACGCTCCTCCTCTGAT 

CCCGAGGTTGCGAGGAGGAGACTA 



GGTCTCACGGGAGrTTCAGCGTCCGTTTGGACTCCCGTTTCGATGTCTGTTCCCATTCGAACGTCACTTCTTATGACC ' "~ *" 



-pHH14-3



-pCB212



-full available ORF HU-Unc53/1 = pLM1 OR



Q S A L < V A G K P £ G K A T 0 K GKLAVKNTGLQRSSSD 



GCTGGTCGGGACCGCCTGAGTGATGCTAAGAAGCCCCCC yCGGGCATTGCTCGC CCCTCCACTTCGGGATCCTTTGGCTACAAGAAGCCTCCTCCTGCCM 

CGATGTTCTTCGGAGGAGGACGGT 



CGACC AGCCCTGGCGGACTCACTACGATTCTTCGGGGGGAGCCCGTAACGAGCGGGGAGGTGAAGCCCTAGGAAACCG * ' ~ ' ^ 



•pHH14-3 



-pCB212 



-full available ORF HU-Unc53/1 = pLM1 OR



A G R , D R L S P / * < P PSGIARPSTSGSFGYKKPPPA 



CAGGCACAGCCACrGTCATGCAAACTGGTGGTTCAGCCACTCrCAGCAAGATCCAGAAGTCCTCAGGCATCCCTGTCAAGCCAGTAAATGGGCGCAAGA: 



G7CCGTGTCGGTGACAGTACGTTTGACCACCAAGTCGGTGAGAGTCGTTCTAGGTCTTCAGGAGTCCGTAGGGACAGTTCGGTCATTTACCCGCGTTCT2 * W ' 



-pHH14-3



-pCB212



-full available ORF HU-Unc53/1 = pLM1 OR

T G T A T V M Q TGGSATLSKIOKSSGIPVKP 



T^5C fTAGATGTTTCCAACAG TGCAGAGCC AGGATTCCTGGC TCCTGGAGCCCGTTC TAACA TCCAGTACCGC AGCCTGCCCCGGCC AGC C AAG T C AAG ' 
ArCGAATCTACAAAGGTTGrCACGTCTCGGTCCTAAGGACCGAGGACCTCGGGCAAGATTGTAGGTCATGGCGTCGGACGGGGCCGGTCGGTTCAGTTCK 



-pHH14-3
-pCB212



-full available ORF HU-Unc53/1 = pLM1 OR

3 L 0 V S W S A E P G F L A P G A R S N I 0 Y R S L P R P A X S 3 



WO 98/24810 148/270 PCT/EP97/06956 
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rrArGAGCGTGACCGGCGGGCGGGGTGGACCTCGCCCTGTGAGCAGCAGCATTGACCCCAG 



TC TCCTCAGCACCAAGCACGGAGGCCTTACGCCTTCC-3 



aga tactcgcac tggccgcccgccccacctggagcgggacac tcgtcgtcgtaactggggtc agaggag tcgtggttcgtccctccggaatgcggaagg"* ' 7 °' : 



-pHH14-3 



-pCB212 



-pCB210-14 



-full available ORF HU-Unc53/1 = pLM1 OR



S M S V T G G RGGPRPVSSSID 



- rev primer HU53rv4 



p S L LSTKQGGLTPS 



GACTGAAGGAGCCrACCAAGGTAGCCAGTGGGCGGACCACTCCAGCCCCTGTCAATCAGACAGATCGGGAAAAGGAGAAGGCC AAAGC 

ctgacttcctcggatggttccatcggtcacccgcctggtgaggtcggggacagttagtctgtctagcccItttcctcttccggtttcg^ ,atX 



-pCB212



-pHH14-3



-pCB210-14



A K A K A V A 



-full available ORF HU-Unc53/1 = pLM1 OR

R L K E P T K V A 5 G R T T p A P V N Q T 0 R £ K E K 
CrTGGACTCAGACAACATCTCCTTGAAGAGT ATTGGCTCCCCAGAAAGrACTCCCAAGAACCAAGCAAGCCACCCCACAGC 

GAACC IGAGTCTGTTG TAGAGGAACTTC TCATAACCGAGGGGTCTTTCATGAGCGTTCTTGOTTCGTTCGG TGGGGTGTCGGTGGTTCGACrifiTr Trr« a- '** 




LDS QN I SLKS I GS PESTPKNQA3HPTAT 



K L A £ L 



CCACCAACCCCTCTCAGGGCCACAGCGAAGAGCTTTGTCAAACCACCCTCACTAGCC AATC T TG AC 

GGTGGTTGGGGAGAGTCCCGGTG TCGCTTCTCGAAAC AG TTTGGTGGGAGTGATCGGTTAGAACTG TTCCAGT TGAGGTTGTr AfiArr TAnATrr.T ai-t). 



AAGGTCAAC TCCAACAG TC TGGATC TACCA TCAT 



-pHH14-3 



-pCB210-14



-full available ORF HU-Unc53/1 = pLM1 OR

° ? T P L R A T A K 5 F V« P P S L A N L 0 K V N S N S L D L P S 




CCAGTGATACCACCCATGCTTCAAAGGTCCCAGATCTGCATGCTACAAGCTCAGCATCTGGGGGCCCTCTCCCTTCCTCCT TC AC CCCC AG TCC GGCACC 

ggtcactatggtgggtacgaagtttccagggtctagacgIacgatgttcgagtcgtagacccccgggagagggaaggacgaagtgggggtcaggccg 2IC ' 



-pHH14-3



-pCB210-14



-full available ORF HU-Unc53/1 = pLM1 OR



S QTTHA SKV PQLHAT SSASGGPLPSCFTP 



S P A P 



CATCCTCAATATTAACTCAGCCAGCTTCTCCCAGGGCCTGGAGCTAATGAGTGGTTTCAGTGTGCCAAAAGAGACCCGCATGTACCCCAAArTr 



TCAGGC 



gtaggagttataattgagtcggtcgaagagggtcccggacctcgattactcaccaaagtcacacggttttctctgggcgIacatggggtItgagagtccg 220 



-pHH14-3



-pCB210-14



-full available ORF HU-Unc53/1 = pLM1 OR

1 L N . ! N S , A 5 F S Q G L E L M S G F S V P K E TRMYPKLSG 



ctgcacaggagcatggagtccctccagatgccaatgagcctccccagtgccttccccagcag tactcccgtccccaccccacctgctccccctgctgctc 

CACGTGTCCTCGTACCrCAGGGAGGTC^^ 230C 
— pHH14-3 



-pCB210-14



-full available ORF HU-Unc53/1 = pLM1 OR



LHB SMj-SlQM PMS LPSAFPS STPV P TPPAPPAA 

CCACAGAAGAAGAGACGGAAGAGCTGAC TTGGAGTGGAAGCCCCAGAGCTGGGCAACTGGACAGTAATCAGCGGGATCGGAACAC TCTTCC C AAGAAAG3 
GGTGTCTTCTTC TCTGCCT TCTCGAC TGAACCTCACCTTCGGGGTCTCGACCCG TTGACC TG TCATTAGTCGCCCTAGCCTTGTGAGAAGGGTTCTTTCC 

"~ " pHH14-3 - . 



-pHH3b



-pCB210-14



-full available ORF HU-Unc53/1 = pLM1 OR



| rev primer HU53rv3 | rev primer HU53rv2 |



peptide 672628H 

' E E E T5 EL T VS G S PRAGQLDSNGRQRNTLP 



K K G 



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GCTCAGGTACCA GCTTCAGTCCCAGGAGGAGACCAAGGAGAGGCGACATTCCCATACCArTGGTGGGCTGCCTGAATCCG ATGACCAGTCAGAGCTGCCT 
CGAGTCCATGGTCGAAGTCAGGGTCC TCCTCTGGTTCCTCTCCGCTGTAAGGGTATGGTAACCACCCGACGGACTTAGGCTA C TGGTCAGrCTCGA-rr«r.l 



-pHH14-3



-pHH3b



-rev primer HU53rv1



-full available ORF HU-Unc53/1 = pLM1 OR



LRYQLQSOEETKERRH 



S H T | G G LPESOOQSEUP 



TCTCCCCC 



TGCACTTCCCATGTCTC ^^^|^^ AAA QQ^CCAACTTACCAACATAGTGA6TCCCACTGCGGCCACCACGCCAAGAATCACCCGCTCCAAC«S 



AGAGGGGGACGTGAAGGGTACAGAGACTCACGTTTCCCGGTTGAATGG 



TTGTATCACTCAGGGTGACGCCGGTGGTGCGGTTCTTAG TGGGCGA 




GCATCCCCACCCACGAGGCGGCCTrCGAGCTGTAC AGCGGCTCCCAAATGGGGAGCACCCTG TCCCTGGCCGAGAGACCCAAGGGA ATGATTCGGTCAGG 
CGTAGGGGTGGGTGCTCCGCCGGAAGCTCGACATGTCGCCGAGGGTTTACCCCTCGTGGGACAGGGACCGGCTCTCTGGGTTCCCTTACT 




^CCTTCCGAGACCCCACGGACGATGTTCACGGCTCAGTGCTGTCCCTGGCCTCCAGTGCCTCCTCCACCTAr 



TAGGAAGGCTCTGGGGTGCCTGCTACAAGTGCCGAGTCACGACAGGGACCGGAGGTCACGGAGGAGGTGGATGAG^ 



TCCTCAGCTGAGGAGAGGATGCAATCT 



•GTCGACTCCTCTCCTACGTTA.jA 




3FR0PT0D 



v H G 5 V L SLASSASSTYS 



3AEERMQS 



GAGCAAATC 



TCCGGAAGCTTCGTAGGGAAC tggaatc ATCCCAGGAAAAAGTGGCC, 



ACCTTGACGTCTCAGCTTTCTGCCAATGCTAATCTGGTGGCTGCTT 



CTC6rTT.G G cCTTCGAAGCATCCCTTGACCTTAGTAGCGTCCTTTTTCACCGGT G GAACTGCAGA G TrL,.r. fl rr.,-. T :.rr. 



AT TAGACC ACCGACGAm 




L ft R E 



-full available ORF HU-Unc53/1 = pLM1 OR 
^- E 3 S Q EKVATLFSOL 



SANANLVAA 



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tgagcagagcctggtgaatatgacatccc gcctgcgacacctggcagagacggccgaggagaaggacactgagc tgc TGGATTTG CG.AGAAACCAT^GA 

^ilL— ?^^^^^ A ^^ A ^ A ^ TGTAGGGCGGACGC TG TGGACCGTC TCTGCCGGCTCCTCTTCC tGrGACTCGACGACCTAAACGCTCrTTCGrTTC T' 




' E ° $ L V " " T S R L * " t A E T A E E K 0 T E U L D L R E T , p 

CTTTCTGAAGAAAAAGAACTCTGAGGCCCAG GCAGTCATTCAGGGAGCCCTTAATGCCTCAGAAACCACACCCAAAGAACTTC GGATCAAGAGAC^ 
GAAAGACTTCTTTrTCTTGAGACTCCGGGTCCG TCAGTAAGTCCCTCGGGAATTACGGAGTCTTTGGTGTCGGTTTCTTGAAGCCTAGTTCTCTGTTT T.: ^ 




-full available ORF HU-Unc53/1 = pLM1 OR

■ S ° . S ' S S 1 " S ' T S .H S S . G 8 S K D A D A K K K K K K S V 




-full available ORF HU-Unc53/1 = pLM1 OR

" Y . ' V * S S - F " ' A F S ! ; * G P K S A 5 S V S D 1 E E , A T p D 





ACCG .GCCCTGCTCACCCA6CCCCCC ACACTAGGCTGTTCCATGCAAATGAGGAGGAGGAGCCAGAGAAGAAGGAGGTA rCGGAGCTGCGCTCTGAGl' 
TGGCTCCCGGGACGAGTGGGTCGGGGGGTGTGATCCGACAAGGTACGTTTACTCCTCCTCC TCGGTCTC WcTTCCTCCATAGCCTCGACGCGAGACTCS ^ 



-U2 ORF = pCB251 ORF



-pHH3b



-full available ORF HU-Unc53/1 = pLM1 OR



TEGPAHPAPHTRLFHANEEEEPE 



KKEVSELRSE 



TATGGGAGAAGGAAATGAAGCTTACAGACATCCGCTTGGAGGCCCTCAACTCTGCCCACCAACTGGATCAGCTTCGG GAGACCATGCACAACArGCAGTT 
ATACCCTCTTCCTTTACTTCGAATGTCTGTAGGCGAACCTCCGGGAGTTGAGACGGGTGGTTGACCTAGTCGAAGCCCTCTGGTACGTGTTGTACGTCAA ^ 



-U2 ORF = pCB251 ORF



-pHH3b 



•peptide B72S27H 



-full available ORF HU-Unc53/1 = pLM1 OR 



-U3 ORF = pLM5 ORF



LV EKE MK L T01 R L E A I N SAH QL DQLRE T M H N M 0 L 

GGAGGTGGACCTGC TGAAAGCAGAGAATGACCGACTGAAGGTAGCCCCAGGCCCCTCATCAGGC TCCACTCCAGGGCAGGTCCCTGG ATCATCTGCATTA 
CCTCCACCTGGACGACTTTCGTCTCTTACTGGCTGACTTCCATCGGGGTCCGGGGAGTAGTCCGAGGTGAGGTCCCGTCCAGGGACCTAGTAGACGTAi 



+ 37CV 



-U2 ORF = pCB251 ORF



-pHH3b



-full available ORF HU-Unc53/1 = pLM1 OR



EVOLLKAENDR 



-U3 ORF = pLM5 ORF 



LKVA, p GPSSGSTPGQ -yPGSSAL 



TCTTCCCCACGCCGCTCCCTAGGCCTGGCACTCACCCATTCCTTCGGCCCCAGTCTTGCAGACACAGACCrGTCACCCATGGATGr^ATr^ 



TACTTGTG 



XGAAGGGGTGCGGCGAGGGATCCGGACCGTGAGTQGG TAAGGAAGCCGGGGTCAGAACGTCTGTGTC TGGACAGTGGG TACC TACCCTAuTC ATGAACAL' 



-U2 ORF = pCB251 ORF



-pHH3b



-full available ORF HU-Unc53/1 = pLM1 OR



U3 ORF = pLM5 ORF 

S S P - R R S L G I A , >■ T H S F G P S L A 0 T 0 L S P M 0 G I S T C 



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GTCCAAAGGAGGAAGTGACCCTCCGGGTGGTGGTGAGGArGCCCCCGCAGCACATCATCAAAGGGGA CTTGAAGCAGCAGGAATTCTTCCTGGGCT3TAG 
C AGGTTTCCTCCTTCAC TGGGAGGCCCACCACCACTCCTACCGGGGCGTCGrG TAGTAGTrTCCCCTGAACTTCGTCGTCCTTAAGAAGGACCCGACATC 



-U2 ORF = pCB251 ORF



-pHH3b



-full available ORF HU-Unc53/1 = pLM1 OR



•U3 ORF = pLM5 ORF 



G p KEEVTLftVVVRMPP*OHl 



IKGDLKQQEFFLGC 



CAAGGTCAGTGGAAAAGTTGACTGGAAGATGCTGGATGAAGCTGTTTTCCAAGTGTTCAAGGACTATATTTCTAAAATGGACCCAGCC TCTACCCTGGGA 
.^ZT_^£ AG T ^* AC ^^^^^^ AA ^^^ A( * C ^^CTACGACCTACTTCGACAAAAGGTTCACAAGTTCC 1"GATATAAAGATTTTACCTGGGTCGGAGATGGGACCCT 



-U2 ORF = pCB251 ORF



-pHH3b



-full available ORF HU-Unc53/1 = pLM1 OR



•U3 ORF = pLM5 ORF 



K V S G K V 0 V K M L D E A V F Q V F K 0 Y ISKMQPASTLG 



CTAAGCACTC 



TGAGTCCATCCATGGCTACAGCATCAGCCACGTGAAACGAGTGTTGGATGCAGAGCCCCCCGAGATGCCTCCTTGCC GTCGAGGTGTC AATA 
GATTCGTGACTCAGGTAGG ^CCGATGTCGTAGTCGGTGCACTTTGCTCACAACCTACGrCTCGGGGGGCTCTACGGAGGAACGGCAGC TCCAcifiTTAT 



-U2 ORF = pCB251 ORF



-pHH3b 




■U3 ORF = pLM5 ORF 



LSiesiHGYSl 



-pHH15

3 H VKRVLD'AEPPENPPCRP 



G V N 



AC ATA FCAG TCTCCC TCAA AGGTCT3AAGGAGAAA TGCGTCGACAGCC TGG TGTTCGAGACGCTGATCCCCAAGCCGATGATGCAGC AC T AC ATAA3CCT 
TG FATAG TCAGAGGGAGTT TCCAGAC TTCC TC TT 7 ACGC AGC TGTCGGACC ACAAGC TC TGCGAC TAGGGG TTCGGCTAC TACG TCG TGATG TATT^Gsl 



-U2 ORF = pCB251 ORF



-pHH3b




-full available ORF HU-Unc53/1 = pLM1 OR
-U3 ORF = pLM5 ORF



H I 3 



S L * G L K E < 



-pHH15

VOSLVFETL I P K P 



M M 0 H Y I i> L 



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CC I vlC TGAAGCACCGGCGCCTCGTCCTCTCGCGCCCC AGCGGCACGGGCAAGACCTACCTGACCAATCGCTTG GCCGAGTACCTG3TGGAGCGCTC TGGtl 
GGACGACTT CGTGGCCGCGGAGCAGGAGAGCCCGGGGTCGCCGTGCCCGTTCTGGATGGACT GGTTAGCGAACCGGCTCATGGACCACCTCGC3AGACC-; ^ 



-U2 ORF = pCB251 ORF



-pHH3b



-U4 ORF = pCB201 ORF



-full available ORF HU-Unc53/1 = pLM1 OR
-U3 ORF = pLM5 ORF



-pHH15



L L K H R R L V L S G P S G T G K T Y L T N R L A EYLVERSG 

CGTGAGGTCACAGAGGGCA|TCGTCAGCACCTTCAACATGCACCAGC AG TCTTGCAAGGATCTGCAACTGTATC TTTCCAACC TAGCCAACCAGATAGACC 
GCACTCCAGTGTCTCCCGTAGCAGTCGTGGAAGTTGTACGTGGTCGTCAGAACGTTCCTAGACGTTGACATAGAAAGGTTGGATCGGTTGGTCTATCTGG ^ 



-U2 ORF = pCB251 ORF



-pHH3b



-U4 ORF = pCB201 ORF




-pHH15

E I , T E G [ V 5 T F NMHQQSCKOLQL V L S N L A W 0 ! [) 



GG5AAAC AGGAATTGGGGATG TGCCCCrGGTGATTCTATTGGATGACC TGAGTGAAGC AGGC TCCATCAGTGAGTTGGTC AATGGGG CCCTC ACCTGC AA 
CCC TTTGTCC TTaACCCCTACACGGGGACCAC TAAGAT AACC TAC TGGAC TCACTTCG TCCGAGGTAGTCACTCAACCAG TTACCCCGGGAGTGGACGTT 



-U2 ORF = pCB251 ORF




-full available ORF HU-Unc53/1 = pLM1 OR




R ^TGIG0VPLVILL0O 



LSEAGS1SEL VNGALTC* 



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G fATC A . AAA TQ TCCC TATATTATAGCTACCACCAATCAGCC FGTAAAAATGACACCCAACC ATGGC T TGC AC TTGAGCTTC AGGATGTTG ACC TJZ'CZ 
CArAGTATTTA CAGGGATArAATArCCATGGTGGTTAGT CGGACATTTTTACTGTGGGTTGGrACCGAACGTGAAC KGAAGTCCTACAAC TGu AA^^u" ^ 



-U2 ORF = pCB251 ORF



-pHH3b 




-peptide E72C20H 



Y H K C P Y I I GTT NQ PV KMTPNHGLHLSFRML T F o 

AACAACGTGGAGCCAGCCAATGGCrTCCTGGTTCGTTACCTGAGGAGGAAGCTGGTAGAGTCAGACAGCG ACAKAATGCCAACAAGGAAGAGCTGCTT: 
TTG rTGCACCTCGGT CGGT TACCGAAGGACCAAGCAATGGACTCCTCCTTCGACCATC TCAGTCTGTCGCTGTAGTTACGGTTGTTCCTTCTCGACGAAG 



-U2 ORF = pCB251 ORF




-pHH15 



N N V £ P a M G F L V R Y L R R K L V E S 0 S 0 IN A N K E E L L 

GGGTGCTCGACTGGGTACCCAAGCTG FGG T A TC A TC TCC AC ACC T TCC TTG AG AAGC AC AGC AC C TC AG AC TTCCTC A TCCGCCC TTGC T TC TTTC TGTl" 
CCCACGAGCTGACCCATGGGrTCGACACCATAGTAGAGGTGTGGAAGGAACrCTTCGTGTCGTGGAGTCTGAAGGAGTAGCCGGGAACGAAGAAAGAC^a 



-U2 ORF = pCB251 ORF




' — pHH15 

P V L D V V P k L VYHLHTFLEKHSTSOFL-GPCFFLS 



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GTUTCCCA7TGGCATTGAGGACTTCCGGACCTGGTTCAT TGACC TG TGGAACAACTC TAT CA TTCCCTAT3 TACAGGAAGGAQCCAJ.GGArGG-2 AT AAAU 



-U2 ORF = pCB251 ORF



-pHH3b



-U4 ORF = pCB201 ORF



-full available ORF HU-Unc53/1 = pLM1 OR



-U3 ORF = pLM5 ORF



-pHH15



CP[ GIEOFRTWFIOLVNNS[ IPYLQEGAk'OG \ \ 

I ■ ■ i I ■ ■ A , ■ ■ I , ,. 1 . . . . . . ■ • , 

GTCCATGGAC AGAAAGC TGCTTGGGAGGACCCAGTGGAA TGGGTCCGGGAC ACACTTCCC TGGCCATCAGCCCAACAAGACC AATCAAAGC TGTACCAC^ 
CAGGTACCTGTCTTTCGACGAACCCTCCTGGGTCACCTTACCCAGGCCCTGTGTGAAGGGACCGGTAGTCGGGTTGTTCTGGTTAGTTTCGACATGG'GG 



-U2 ORF = pCB251 ORF



-pHH3b



-U4 ORF = pCB201 ORF



-full available ORF HU-Unc53/1 = pLM1 OR
-U3 ORF = pLM5 ORF



-pHH15



' H 3 0 * A A v E D P V E V VRDTLPVPSAOQDOSICLYH 

■GUCCCCACCCACCGTGGGCCCTCACAGCATTGCCTCACCTCCCGAGGATAGGACAGTCAAAGA CAGCACCCCAAGTTCTCTGGACTCAGATCC'CTG.^' 
ACGGG3GTGGGTGGCA:CCGGGAGTGTCGTAACGGAGTGGAGGGCTCCTATCCTGTCAGrTTCTGTCG7G3GGrTCAAGAGACCTGAG7CTAG3AGACT^ 



-U2 ORF = pCB251 ORF



-pHH3b



-U4 ORF = pCB201 ORF



-full available ORF HU-Unc53/1 = pLM1 OR



-U3 ORF = pLM5 ORF




• ggccatgctgctgaaac ttcaagaagctgccaac tac at tgagtctccagatcgagaaaccatcc fg gacccc aaccttc aggcaacac tttaagggt tc 

CCGGTACGACGACTTTGAAGTTCTTCGACGGrTGATGTAACTCAGAGGTCTAGCTCTTTGGTAGGACCTGGGGTTGGAAGTCCGTTGTGAAATTCCCAAu ^ 



-U2 ORF = pCB251 ORF



-pHH3b



-U4 ORF = pCB201 ORF



-full available ORF HU-Unc53/1 = pLM1 OR



-U3 ORF = pLM5 ORF



-pHH15



-peptide 372625H



G F 



AML LKL QEAANYI ESP O RETILOPNLQATL. 

GGCAATCACTGTCACCCCCGGACAGCAGAACGCTGGCATCAGCTATCTTAGCTCCTCCTC TCCCCTCTCCTCTTTCAGAGCAC TGGCTC TCCAGCCCCA3 
CCGTTAGTGACAGTGGGGGCCTGTCGTCTTGCGACCGTAGTCGATAGAATCGAGGAGGAGAGGGGAGAGGAGAAAGTCTCGTGACCGAGAGGTCGGGGTC ^ 



-pHH3b 



" — PHH15 

gnhchprtae rvhqls lllsplufqs tgspap 



GAGGAGAACAGGAGGGAGGAGGAGATGAAAGAGGAGGGACAGGTTCTTGGTGCTGTACCTTTGAGAACTTCCTAGGAAGGAATGGTGG GGTGGCGrtTGG 

ctcctcttgtcctccctcctcctctactttctcctccctgtccaagaaccacgacatggaaactcttgaaggatccttccttaccaccccaccgcaaac: 



-pHH3b 



•PHH15 



G E Q £ G G GDERGGTGSVCCTFENF 



LGRNGGVAFG 



GAACrTGTGCCCCCTAAAC ^CAf TTACTGGCC TCC TCTAATGAC TTTGGGGAAAAGATGATTCTGGGTCTT TCCC TTGAC TTC TTGTTTC AA TTAC AAA:.' 
CTTGAAC ACGGGGGAT TTG TGTAAATGACCGGAGGAGATTAC TGAAACCCC TTTTC TACTAAGACCCAGAAAGGGAAC TGAAGAACAAAGT TAATG T"TTo "'^ 



-pHH3b 



■PHH15 



N L C , p L NTFTGLL..LWGKOOSGSFP 



L L V S I T N 



TCCTGGGCTTTCTGGGGAGGGGTTCAGAAAACATCAAAACACTGCAGCAGTTCCTAAArGATrCTCACAAGCAACCCTGAGAGAGACAGTCT TG TG AGGu 

aggacccgaaagacccctccccaagtcttttgtagttttgtgacgtcgtc AAGGATTTACTAAGAGTGTTCGTTGGGACTCTCTC TG TCAGAAC AC TC rr 

-pHH3b



-pHH15

S W A F V G G V C K T 3 K H C S S S . M I L T S N P £ S 0 S L V P 




AGATC rGGGGGAGGCAGGAAGCTCCTCAGATTTTC TC ACAGACCCTTCCCAATTCCATCACC ACTGCCAAC AAC7CCTCCCC CAGAGATCTGGC TGGAGC 
TCTAGACCCCCTCCGTCCTTCG AGGAGTCTAAAAGAG TG TCTGGGAAGGGTTAAGGTAGTGG rGACGGTTGTTGAGGAGGGGGTC fC TAGA'CG ACCT'"* : * ,7: ' 

-pHH15

L A G A 



ElVGRQE APQ tFSQTLP NS ITTANNSSPRO 



CCAGAAAAAGAAGCATGTGGTTTAAAAAATGTTTAAATCAATCTGTAAAAGGTAAAAATGAAAAAACAAAAACAAGCAAACAAACAAAAAACAAT G6AAA 
GGTCTTTTTCTTCGTACACCAAATTTTTTACAAATTTAGTTAGACATTTTCCATTTTTACTTTTTTGTTTTTGTTCGrTTGTTTGTTTTTTGTTACCTTT 
Q KK KHV V . K M F K S I C * K • * . K N K N K Q T M K K QV , 

AGATGAAGCTGGAGAGAGAGGAACC AGTTGCCAAGGTAGAGAGCTGCCCGCTCCTGCCCTC rGGATGACATAGGGGACATCAACAAG ACGGCTGCCAAC- 

TCTACTTCGACCTCTCTCTCCTTGGTCAACGGTTCCATCTCTCGACGGGCGAGGACGGGAGACCTACTGTATCCCCTGTAGTTGTTCTGCCGACGGTT<;V ^ 
" S V R E R N Q L P R ■ RAARSCP L DP I G D I N K r A A M 

TGAGAAGTCACCAAACCACAAAAATAACCTTACAGCCTTCAGGGAAAGACTACCAGCTCTGTCTTTCTACCCTCTAATTTAACAATGCACCGGAATTC A3 
ACTCTTCAGTGGTTTGGTGTTTTTATTGGAATGTCGGAAGTCCCTTTCTGAfGGTCGAGAC AG AAAGATGGGAGATTAAATTGTTACGTGGCC TTAAGTC *** 



•linker? - 

L * . S H Q T T . K 1 T > Q P S G K D V Q L C L S T L . F N N A P C F o 

CTTGGACTTAACC 

' I - 6013 

GAACCTGAATTGG 



3 



— linker? 
L D L T 
GGCACGAGGCATCCTCTGTGGGCACCGAGGTCACCGAGACCCCTGCTCATTCAGTCCCCCACACTAGACT 70
| linker ? | open reading frame

PHEASSVGTEVTETPAHSVPHTRL 

GTTCCAAGCCAATGAAGAGGAGGAGCCAGAGAAGAAGGAGGTATCAGAACTGCGCTCTGAACTATGGGAA 140
open reading frame

FQANEEEEPEKKEVSELRSELWE

AAAGAGATGAAGCTCACGGATATCCGGTTGGAGGCCCTCAACTCTGCCCACCAGCTGGACCAGCTTCGGG 210 
open reading frame 

KEMKLTDIRLEALNSAHQLDQLR

AGACCATGCACAATATGCAGTTGGAGGTGGACCTGCTGAAAGCAGAGAATGACCGGCTGAAGGTTGCCCC 280
open reading frame

ETMHNMQLEVDLLKAENDRLKVAP

CGGCCCCTCCTCAGGCTGCACTCCAGGGCAGGTCCCTGGGTCATCGGCTCTGTCGTCCCCTCGACGTTCC 350
open reading frame

GPSSGCTPGQVPGSSALSSPRRS

CTGGGCCTTGCACTCAGCCATCCTTTCAGTCCTAGTCTCACAGACACAGACCTCTCACCCATGG^'GGCA 420
open reading frame 

LGLALSHPFSPSLTDTDLSPM:G

TCAGCACCTGTGGTTCAAAGGAAGAGGTGACCCTGCGGGTGGTGGTCCGGATGCCGCCCCAGCACATCAT 490
open reading frame

ISTCGSKEEVTLRVVVRMPPQHII
caaaggggacttaaagcagcaggagttcttcc:gggttgcagca^gg:c^grggcaaagt'3act:-:aag zir'j 

__ open reading frame ^ 

KGOlKOOEFTlGCSv VSGkvO.k 

ATGC TGGA TGAAGCCGT T TTCC AAG TGT TC AAG jACTACATTTC" ^AAA'GGaCCC agcc tcaai zc tgg cjj 
. open reading frame 

MLOE AVFOVT. 3Y | SkMQr A $ - 

GACTGAGCACTGAGTCCATACATGGCTATAGCC'CAGCCACGTGAAACG^GTGCTGGA TGC fGAGICCCC *x 
open reading frame 

CLSfESlHGY5uSHV<f;v < .0At = c> 




WO 98/24810 



160/270 
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AGAGATGCCTCCTTGCCGCCGAGGTGTCAA fAACATATCAGTCGC TC7CAAAGG7C TGAAAGAGAAG7GT 770 



EMPPCRRGVNNISVALKGLKEKC

GTCGACAGCCTGGTGTTCGAGACGCTTATCCCCAAGCCCATGATGCAGCACTACATCAGCCTCCTGCTCA 840

open reading frame

VDSLVFETLIPKPMMQHYISLLL

AGCACCGGCGCCTGGTGCTCTCCGGCCCCAGTGGCACCGGCAAGACCTACTTGACCAATCGGCTAGCCGA 910 

open reading frame 

KHRRLVLSGPSGTGKTYLTNRLAE

GTACCTGGTGGAGCGCTCCGGCCGCGAGGTCACGGATGGCATCGTCAGCACTTTCAACATGCACCAGCAG 980

open reading frame

YLVERSGREVTDGIVSTFNMHQQ

TCTTGCAAGGATCTGCAACTGTACCTCTCCAACCTAGCCAACCAGATAGACCGGGAAACAGGGATAGGGG 1050

open reading frame

SCKDLQLYLSNLANQIDRETGIG

ATGTGCCCTTGG7GATCCTCCTGGATGATCTGAGTGAAGCAGGCTCCATC^GrGAGC:GGTCAATGGGGC '^0 
open reading frame 

DVPLVILLDDLSEAGSISELVNGA

CC TCACC TGCAAG T ATCAC AAA TGTCCC TACATTATAGGTACCACCAA TC AGCC TG 7AAAAA TGAC AC CC : 5 HO 
open reading frame 

LTCKYHKCPYIIGTTNQPVKMTP

AACCATGGCTTGCACrTGAGCTTCAGGATGCTGACCTTCTCGAACAATGTGGAACC^GCC^ATGGCTT^C '160 
open reading frame 

NHGLHLSFRMLTFSNNVEPANGF

TGGTCCGTTACCTGCGGAGGAAGTTGG7AGAG7CAGACAG fGACG TCAA T GC T AAC A^GG w IGAGC T 3 Z T ' jj? 
open reading frame ^ 

uVR YL "RKLVlSDSOVNA*|. -: r 

■CGGGTSCTGGACrGGGTGCCCAAGCTGTGGrATCACCTCCACACCrrcCTGGAGAAGCACAGCACCTCG ' »\ '0 
open reading frame 

R V L 0 ./ V 0 ► L V^HLHTFLE - hS : i 



open reading frame 
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GACTTCCTCATTGGCCCTTGCTTCrTCCTGTCCTGTCCCATrGGCATCGAGGACTTCCGGACCTGGTTCA 

open reading frame 

0 F L I GPCFrLSCP t G I E 0 F R T V F 

TTGACCTGTGGAACAATTCCATCATCCCCTATCTACAGGAAGGAGCCAAGGATGGGATCAAGGTTCATGG 15^0 

open reading frame 

IDLWNNSIIPYLQEGAKDGIKVHG

ACAGAAAGCTGCTTGGGAAGACCCGGTGGAATGGGfCCGAGACACTCTTCCCrGGCCGTCGGCCCAACAA 1610 
open reading frame 

QKAAWEDPVEWVRDTLPWPSAQQ

GACCAATCAAAGCTC7ACCACCTGCCCCCGCCT7CTGTGGGCCCCCACAGCACTGCCTCACCCCCGGAGG :630 
open reading frame 

DQSKLYHLPPPSVGPHSTASPPE
ACAGGACAGTCAAAGACAGCACTCCAAACTCCCTCGACTCAGATCCCCTGArGGCCATGCTACTGAAACT ^50 

open reading frame 

DRTVKDSTPNSLDSDPLMAMLLKL

CC AAGAAGC TGCCAAC TAC A 7TGAGTCACCAGATCGAG AG AC TATCCTGGACCCCAACCTCCAGGCGACA :32C 

open reading frame 

QEAANYIESPDRETILDPNLQAT
CTCTGAGGGCCCGGCAGTCACTGTCACCCTGGAGGGCAGAAGGCTGGCTTCAGCATCATTAGCrCTCCTC 153C 



2F 



3' untranslated 



L . GPGSHCHPGGOKAGFS! ISSP 

TGCCCTCTTCCTTCATAGCTCTGGCTCACCAGCCTCGCCAAGAGAACAGGAGGGAAGAAGAGGGCAGGAG '950 

3' untranslated 

LPSSF I A L A H 0 P P C E NR-£EiGR-* 

GAGGGATGGGTTCTCGGTGC7GAACCTTTGAGAACTTCCTACTAGGAATTGGAGGGGGTGGAGTTTGAGA COy. 

_ 3' untranslated 

P D G F S V L -N L . E t P T R N » p G w $ L 

ACTCCGTGCCCCTTAACTiCATT'3CTGGCCTC:7CTrACGACTTAGGAGAAAAGArGAT"CTGGTCT^T : ' ! 

3' untranslated 

* P C P L TTFAGLLLRLRRKODSGL 

TCTTCAAGT TTTGTTTCACC T AC A A AC TC T TGGGC T TTC TGGGGAGGGAT TCGGAAGA Ta7 A A AC AG AC A ; " ?v 

3' untranslated 

F F K F C F r* <LLGFLGRDSCOINPO 
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AACAAAAACAAACAAACCAACTACAGCAGTTCCAAGCTCGTTCTCACAAACACCTC TGAGACAGTCACAf 22^0 

3' untranslated 

T K T N < P T7AVPSSFSOTPLS0SH 

GTGGGCAAATCTAAGGGAGGCAGGAAGCTC TACAGACTTTCTTGCAAACCCTTCCCAGT7CTG TCGAC AC 2210 
3' untranslated

VGKS<GGRKLYRLSCKPFPVLST 

TGCCAACAACCTCCCCGCCAGAGACCfGGCCAGAGCCAAGAAAAGAGAAGCATGTGGTT'AACAGAAAAA 2280 

3' untranslated

LPTT5PPE7VPEPRKEKHVV 0 K N 

CAAAACAAAACAAAACAAAAAATATATGTG TAAATCAACC TGTAGAAGGTAAAAACG3CAATGGAAAAGA 2^50 
3' untranslated 

KTKQNK<YMCKSTCRR K a 0 v K R 

TGAAGCTGGAAGGAGGGGCCCAGT73CCAAGATGGAACGAGAGCTGCCAGArCTTGCCTTCTGGATGACA 2520 
3' untranslated

SVKE GPSCOOGTftAARSC LLDO 

AGAGGGGACATTGCAAGATGGC7GCCAGTCTAAAACGTCACCAGACCACAAGAGTAACATCACAGCCT7C 2590 
3' untranslated 

KRGHCKMAASLK&HOTTRV 7 S 0 P S 

GAAGAAAGGCCACAAGC 7G7C7TTC 7GCCC TCTAACTGAAC ATGC ATGAAAAG 7CA A T AAACCC7AC f 77 2660 
3' untranslated

KKGHKLSFCPLTEHA K V N k P Y F 

n^ATTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAATTTCCGCGGCCGC 2709 

| polyA tail + linker |

LIFKKKKKKXKKFPRP 
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AAGCTTGGCACGAGGC CTCGTGCCAAGCTGAGACCGTCATGCAGCTCCGAAATGAGtrAA GAGACAAGGA 70 
| LINKER ? | open reading frame

AWHEASCQAETVMQLRNELRDKE

GATGAAGCTGACAGATATCCGCTTAGAAGCTCTCAGTTCTGC CCACCAGCTGGACCAGrTrrr ; ^^ ..-r . UQ 
open reading frame " " 

M LT 0,R 1-EALSSAH0L00LREA 

ATGAACAGGATGCAGACTGAAATAGAG AAGCTGAAAGCTGAGAATGArCGGCTGAAGrCAGAGTCTC AAn 2 !0 
open reading frame 

MNRMQSEIEKLKAENDRLKSESQ

GCAGrGGCTGCAGCCGGGCTCCTTCCCA AGTGTCCATCTCTGCCTCCCCGAGGCAG TCCATnanrn i 320 

open reading frame " 

G SGCS;,A;> S0VSISASPR0SMG L 5 

CCAGCACAGCTTGAACCTCACTGAGT CAACCAGCCrGGACArGTTGCTGGArfiACACTGGTGAATfi rr-.t 250 
open reading frame ' 

0 HSL ^ LT gSTSLDML LDOrGECS 

GCrCGGAAGGAAGGAGGCAGGCATGTTAAGATAGTTGTCAGCT T TCAGGAGGAAATGAAr.T^AA^ ,^ ;p 0 
— . open reading frame " 

A * KE jGRhvk i Vvsfqe Emkwk£: — 

ATT CCAGACCACAC CTCrTTCT7A TTGGCT6CATTGGAGTrAGT66CAAGACGAilGr flfl nAT«Ti=r Trr., eQO 
.. open reading frame " " — 

° S ft P H L ; ' S C i G V S G K T K v 0 V L 6 

TGGGGTGGTTAGACGGCTGrTCAAAGAAT A CATCArTCATGTCGA CCCAGTGAGTCAGrTAnr.nrr^.T C6 0 
m open reading frame " ' 

G V V " P L F K E ^ I 1 H V 0 P V S 0 L G L .V 

rCAGACAGCGTTCTrGGCr ACAG CATTGGAGAAATCAAGCGCAGCAACACTTCCGAAArArrr. n^rTr.- 620 
, open reading frame " 

S ° S V L G y S I G E I K R S N T S E T P E L 

TTCCTTGTGGCTA-TGG-GGAGAGAACACGACCATCTCAGTG ACTGTGAAAGGGCrr^AnAAA.r^ TC0 
— . open reading frame 

L ' C G V L V G E " H I S v T V K G L A E N S 

C C TGGAC TC AC TG 3 TG T TTG AG *GC TTGAT TCCCAAGCCC ATCC rGCAGCGCTACG TCTrCCTCn r.&rA 770 

open reading frame " " 

L 0 S L F E S L I P < P , L o p y v 3 L L . 

jAGCACCGTCGGATCATTCTCTCTGGCCCCAGCGGCACTGGGAAA ACCTACCTGGCCAACCGGrTr. Tr-, 8q0 
a open reading frame ~" — 

£ H ■ 1 L s c p s G r g K r v [ A N R [ s - 
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AGTATATAGTGCTTCGAGAGGGACGGGAGTTGACAGACGGGGTTATCGCCACCTTTAACGTGGACCArAA 910 

open reading frame 

£Y I VLAEGRELT-OGV I A T F N V D H .< 

GTCCAGCAAGGAATTGCGCCAGTACCTGTCCAACCTTGCTGACCAGTGCAACAGTGAGAACAATGC TGTG 980 

open reading frame

SSKELROYLSNLAOOCNSENNAV 

G ACATGCCCCTCG7C ATCAfCCTGGACAACCTACACCACG TGAGC TCTCTGGGCGAGATCTTCAATGGGC :C50 
open reading frame 

3 M P L V I ILONLHHVSSLGE IFNG 

T GC TCAACTGCAAGTACCAC AAATGCCCTTACATAATTGGC AC AA rGAACCAGGCT ACCjlATC TCCCCTT *!20 

open reading frame J rT *<T ,u ~?c <*<.~ 

llnckyhkcpyi igtmnoatylpf 

ttatactaataatcttata aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa AATTC TGCGGCCGC ' :90 
open reading frame LINKER -vector *) 

Y T N N L I<KKKKKK<KKKK?<FCGR 
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I 1~ 50 L"NC53_hun\2 AAGC7TGGCA CGAGGCCTCC 7GCCAAGC7C AGACCGTCAT CCAGCTCCGA 



5i_I00 VNC53_hura 
? 49-. 98 



20 



30 



40 

JL 



A W H E A S C 0 A Z T V M ~ 0™L^R 



AATGAG7TAA GAGACAAGGA GATGAAGC7G ACAGATATCC GCTTAGAAGC 
N E L ROKE MKL T D I RLEA 



120 
-L 



140 



DO 



1D1-IS0 l*r55_honui7CTCAG7?C7 GCCCACCAGC 7GGACCAGCT CCGGGAGGCC ATGAACAGGA 
99-148 S S AHQ LDQL REA KNR 

1|0 IjO 1|0^ j|o 



L51...2DO UNC52_:"nW PCCAGAG1GA AA7AGAGAAG CTGAAAGC7G AGAATGATCG GCTGAAGTCA 
149...198 MQSE IEK LKA ENOR LKS 

5 j° 222 2J0 S|0 2j0 



2 C 1.-2 50 L*JC5 3_nujra 
199..24B 



2 S l.«300 yNC5 2_hu.Tw 
249.298 



GAG7CTCAAG GCAGTGGC7G GAGCCGGGC7 CCTTCCCAAG TGTCCATCTC 
ESQ GS 3C S R A ? S Q V S I S 



260 



T 



290 



3J0 



7GCC7CCCCG AGGCAG7CCA 7GGGCC7C7C CCAGCACAGC TTGAACCTCA 
A5P RQS MGLS QMS LNL 



301.-350 UNCS3_hu.T. 
299.-3 48 



310 



32C 

_L 



349-393 



4 CI-.-; 50 UNC5 J _huros 
39 9.-44 3 



IT 



340 



ill 



r.ajCTGACTCAAC CACCC7GGAC ATGTTGCTGG A TGACACTGG TGAATGCTCG 
TE5T SLO MLL ODTG ECS 



360 

JL 



390 



400 



GCTCCGAAGG AAGGAGGCAG CCA7CTTAAG A7AGTTCTCA GC 7TTCAGGA 
ARK E G C R H V K I V V S F Q E 



410 



J! 



430 



440 



4 5 1-5 00 UNC5 3_haj7-a 
449.498 



3 C 1.-5 50 3 _hvj:«] 

499.-548 



450 
1_ 



GG AAA TG AAG TGGAAGGAGG ATTCCAGACC ACACCTCTTT CTTATTGGCT 
ZMK W K E DSR? KLF L I Q 



460 



470 



480 



490 



SQ0 



CA77C3AGT 7AG7GGCAAG ACGAAGTGGG ATGTGCTCGA TGGGGTGGTT 
IGV SGK TKW DVLD GVV 



S10 

-L 



S|0 SjC 



330 



AGACCGCTGT TCAAAGAATA CATCA7TCA7 GTCGACCCAG TCAGTCAGCT 
R R I. F K E V I I H V D ? V S Q L 



551-.600 UNC53_^.^a 
549.-598 



559-648 



S60 
J- 



570 



600 
I 



AGCGCT3AAT TCAGACAGCG 7TC7TGGC7A CAGCATTGGA GAAATCAAGC 
GLN SDS V L G Y S I G E I K 



£10 

JL 



•520 
J- 



1° 



640 6|C 



CCAGCAACAC TTCCGAAACA CC5CAGC7GC 77CC7TGTGC CTATCTGGTT 
R S M 7 SET PEL LrCG Y L V 



660 
I 



671 



6« 



6?C 



7Q.C 



1 SSi-.^OO lc:C5.i_h'-jr.a | GGAGAGAACA CGAC7A7C7C AGTGACTvTG AAAGGGC77G CACAAAACAC 
£ 649...69B GEN T 7 Z c VTV KCL A E N S 
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1\2 720 7j0 7|0 7^0 


1 70L..750 'JWe53_hu* 
5 699.-748 


£ CC7GGACTCA CTGGTGTTTG AG7CCTTGA7 7CCCAAGCCC ATCCTGCAGC 
-DS 1 V F ESLI P K P I L 0 

7f5 770 1JO 790 800 


1 751-SOC "JKC33.hu.Tv 
> 749..79B 


aCCTACC-TCTC CCTCC7CA7A GAGCACCG7C GCATCATTCT CTCTGCCCCC 
R Y V S L L I EHR RIIL S G ? 

8jC 8^0 SjO 840 8^0 


L 501.-65C VNC53.hu.T» 
5 799-348 


3JAGCC-GCACTG GGAAAACCTA CC7GGCCAAC CGGCTG7C7C AC7ATATAGT 
SGT G K 7 Y LAM RLS EYIV 

S60 e-jo s^: %90 900 


1 551-900 'JKCS.'Juura 
5 849-898 


jGCTTCGAGAG GGACGGGAGT TGACAGACCG GGTTATCGCC ACCTTTAACG 
-RE G R E LTDG VIA 7 F N 

91? 9|0 9|C 940 9J0 


a ?01-950 'JXC= IJr.urii 
S 8 99.-9 C 8 


r jCACCATAA G7CCAGCAAG gaattgcgcc agtacctgtc caaccttgct 

V 2 H K SSK E L R Q Y L S NLA 

9€0 970 930 950 1C00 


S 949...99S 


GACCAG75CA ACAG7GAGAA CAA7GCTG7G GACA7GCCCC 7CS7CA7CAT 
0 Q C NSEN N A V DM? L V Z I 

lOiC 1320 1030 1040 1050 

i L. 1 1 f 


I 1001-IC50 V>7C 1 2 h- 
5 999-1048 


CCTC-GACAAC CTACACCACG 7GASC7CTC7 GGGCGAGATC TTCAATGGGC 
L D N LHH VSSL GEX FNG 

1C6cP^1^£^ iCTC 1060 '090 HOC 


1 1051...11CC y?:C52_h;j 
5 1C49.-1C93 


TvC7CAAC7G CAAGTACCAC AAA7GCCC77 ACA7AATTGG CACAATGAAC 
-INC K YH KC? YIIG T M N 

1110 11SC 1130 1140 HSC 
ll 1 1 f 1 


5 :o99-.ii<a 


CAGGCTACCir CTTCGACTCC CAACCTGCA3 C7TCACCATA ACTTCAGATG 
CATISST? K L Q _ L K H NFRW 

1140 1170 1190 1200 

1 1 t 1 1 


1 11S1...I2CC 'J>:C 5 3 rv. 
5 U49-.X198 


CCTCC7TTG7 GCCAACCACA CCGAGCC7G7 GAAGGGT7TC C77GCCCGA7 
V L C A N H TEPV KG? LCR 

121? 1220 1230 1240 1250 


i ;2c;...:2S-: vkcsj h. 


rCCTCAGGAG GAAGCTCATG GAAACAGA-^A 7CAG7CGGCG GG7GCGCAAT 
FLRR XIX E 7 E I3GR V R 11 

1260 127? 125J 1290 1300 
1 1 1 • 1 


p. :25i..;ico .VNCSj.h. 

? 1249.-1258 


nTCGA:;C73G TAAAAATCA7 TGAC7GCA7T CCCAAGG7CT GGCATCACC7 

y. z i. vxii dw: pkv whhl 

121? 122C 133C 13^40 13S0 


i 13C1...125C VKCS? nJ 
S :239...1343 


:^.CC3CTTC C7GGAGGC7C ACAGTTCC7C LVGACGTCACC A7CCGCCCCC 
N R F LEA HSSS D VT I C P 

12oO 1370 13 80 13^90 140C 


i 12^ v::c33.:-.. 

S 1349.. 135* 


;-octc*:tcct -;-catgcccc atcgatctgg acggctccag agigtccttc 
* 1 7 l scp i d v dgsr vwf 
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X V° ; v° ; V° ; V° : V°" 


L 1401_I450 UNC5J i* 
5 1399...1448 


iu ACCGACTTGT GGAACTATTC CATTATCCCC TATCTCCTCG AACCCG7CAG 
T 0 L WNYS IIP YLL E A V R 

r 4 , 60 :i i* c isoo 
1 — 1 ! i 1 


L 1451^15:0 'JNCS2 h 
5 1449.-1498 


V AGAAGCAC7C CAGCTCTA7C GAAGGCCCGC CCCCTSGGAG GATCCTGCCA 
SOL QLV G R » A PWE DPA 

1513 1520 1530 1540 1530 


1 1531-1550 UNCS3 h 
3 1499-1542 


l AG7GGG7CAT GGACACATAr CCATGGGCAG CCAGCCCACA ACACCACCAG 
KWVM DTY P W A AS?Q Q H E 

1360 1S73 1530 1S90 1600 


i 1551.-16CO -JNC52 h 
5 1549...1595 


- 7CGCC7CCCC TGC7GCAGT7 ACGGCCTGAG GATGTCGGCT 7CGACGGCTA 
WPP 1< L Q L R P E D V S FDGY 

i6 , i0 l6 , 20 :e ( JC 1630 


I 1S01-.16SG raC53.h; 
5 1599.-1646 


wTCCATGCCT CGGGAGGGAT CGACAAGCAA GCASATCCCC CCCA37CATC 
3 X P RSG STSK CK? PSD 

16,60 1670 1*80 1690 I?0C 

1 1 1 I L_ 


I 1651-.170C -JNCS3_hL 
5 1649...169S 


i» lvAAGG73A CCCGCTGATG AACATGC7GA TGAGGCTGCA GGAGGCAGCC 
A2CD PLM NML MRLQ E A A 

1713 1720 1730 1740 1750 
! iff | 


I 1701^.17 = 0 UNC53_hd 
5 1699-1745 


^CTACTCCA GCCCCCAGAG CTATGACAGC CACTCCAACA GCAACAGCCA 
N Y S S P O S Y O S CSX S N S H 

1760 1770 i?8C 1790 1800 

i i ' • i t 


• 17S1..1SC0 VNC5 3_h\. 
1 1749-.I790 


7CACGATGAC ATCTTGGACT CC7CTTTGGA GTCCACTCT3 TG^JCAGGGGC 
HDD I L O S 5 L E STL ■ J 

1B10 1820 18JC 1840 | 1850 


1 1801...185C ^C;; hu 
5 


CCGGAGCCCA GCGCCCTCCT CTTCTCCTCA CCGCATTCCA CCTGCATCCC 

OS 

1360 18*0 1880 :9 ( 90 1900 


I 1651^1900 UN*CS3_ht 
> ( 


CACA7CACCC TCAAGATGAC T7CC7GACCC AGCCCCAGCC ACAGCCTTAG 

< ss 

is , 2 ° :5 , 3C : V° 1950 


I 19G1...1941 VNC5:_hw 
5 • — - 


nO>Z7CZZ:iZA ACACCGAGAC CCCCCTCCT7 CAGCCTCCIP.C T 
<^ — - 
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EST 46037 



3' RACE (placenta) 
— polyA EST 923793 



5' RACE (placenta) cl 1.4 
5' RACE (placenta) cl 3.7 



GTCAATCAG 

- PCR(HeLa) EST 485068- hh2 E 2.3 



M 
M 
M 
M 



— — PCR (placenta) EST 485068- hh2 C 2.3 
PCR (HeLa) EST0l222-hh2 E 1.3-3 

5' RACE (placenta) B2.1
5' RACE (HeLa) D2.1

5' RACE (adenocarcinoma SW480) H2.1

5' RACE (placenta) A2.2-2

5' RACE (HeLa) B2.1-4

5' RACE (adenocarcinoma SW480) D2.1-5

5' RACE (adenocarcinoma SW480) D4.1-1



5' RACE (melanoma G361) J4.1-4
5' RACE (HeLa) G4.1.1

5' RACE (placenta) F4.1.2



Figure 1 1c 
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EST *j'^T*}?>h 






V T I 



+1 G T R V T I G PRL F L S C PID VDG SRVW FTD
1 GGCACGAGGG TTACCATCGG CCCCCGGCTC TTCCTGTCAT GCCCCATCGA TGTGGACGGC TCGAGAGTGT GGTTCACCGA
AATGGTAGCC GGGGGCCGAG AAGGACAGTA CGGGGTAGCT ACACCTGCCG AGCTCTCACA CCAAGTGGCT 



+ 1 LWN YSII P Y L LEA VREG LQL YGR RAP 
81 CTTGTGGAAC TATTCCATTA TCCCCTATCT CCTGGAAGCC GTCAGAGAAG GACTCCAGCT CTATGGAAGG CGCGCCCCCT 
GAACACCTTG ATAAGGTAAT AGGGGATAGA GGACCTTCGG CAGTCTCTTC CTGAGGTCGA GATACCTTCC GCGCGGGGGA 

+ 1WEDP AKW VMDT YPW A A S PQQH EWP PLL 

NCOI 



161 GGGAGGATCC TGCCAAGTGG GTGATGGACA CATATCCATG GGCAGCCAGC CCACAACAGC ACGAGTGGCC TCCCCTGCTG 
CCCTCCTAGG ACGGTTCACC CACTACCTGT GTATAGGTAC CCGTCGGTCG GGTGTTGTCG TGCTCACCGG AGGGGACGAC 

+ 1QLRP EDV GFD GYSM PRE GST SKQM PPS 
241 CAGTTACGGC CTGAGGATGT CGGCTTCGAC GGCTACTCCA TGCCTCGGGA GGGATCGACA AGCAAGCAGA TGCCCCCCAG 
GTCAATGCCG GACTCCTACA GCCGAAGCTG CCGATGAGGT ACGGAGCCCT CCCTAGCTGT TCGTTCGTCT ACGGGGGGTC 

+ 1 DAE GDPL MNM LMR LQEA ANY SSP QSY 
321 TGATGCTGAA GGTGACCCGC TGATGAACAT GCTGATGAGG CTGCAGGAGG CAGCCAACTA CTCCAGCCCC CAGAGCTATG 
ACTACGACTT CCACTGGGCG ACTACTTGTA CGACTACTCC GACGTCCTCC GTCGGTTGAT GAGGTCGGGG GTCTCGATAC 

+ 1DSDS NSN SHHE DIL DSS LEST L * Q GPG 
401 ACAGCGACTC CAACAGCAAC AGCCATCACG AAGACATCTT GGACTCCTCT TTGGAGTCCA CTCTGTGACA GGGGCCCGGA 
TGTCGCTGAG GTTGTCGTTG TCGGTAGTGC TTCTGTAGAA CCTGAGGAGA AACCTCAGGT GAGACACTGT CCCCGGGCCT 

+1 A Q R P PLL LTA FHLH PPH HPE DDFL SQP 
481 GCCCAGCGCC CTCCTCTTCT CCTCACCGCA TTCCACCTGC ATCCCCCACA TCACCCTGAA GATGACTTCC TGAGCCAGCC 
CGGGTCGCGG GAGGAGAAGA GGAGTGGCGT AAGGTGGACG TAGGGGGTGT AGTGGGACTT CTACTGAAGG ACTCGGTCGG 

+ 1 PAT ALEL REH RDP PSFS LDL GAG IPG 
561 CCCAGCCACA GCCTTAGAGC TGCGGGAACA CCGAGACCCC CCGTCCTTCA GCCTCGACCT GGGTGCAGGC ATCCCGGGCC 
GGGTCGGTGT CGGAATCTCG ACGCCCTTGT GGCTCTGGGG GGCAGGAAGT CGGAGCTGGA CCCACGTCCG TAGGGCCCGG 

♦1 Q L P A DRF LPQR ELH YLL LYFN YCF ALL 
641 AGCTGCCTGC GGACCGCTTC CTTCCACAGC GAGAACTGCA CTACCTTCTG TTGTACTTTA ATTATTGTTT TGCCTTGTTG 
TCGACGGACG CCTGGCGAAG GAAGGTGTCG CTCTTGACGT GATGGAAGAC AACATGAAAT TAATAACAAA ACGGAACAAC 



+1 L * P P * D T EDT SRER I I A VEM KKKK KKK
721 CTGTGACCTC CCTAAGACAC TGAAGATACT TCTCGGGAAA GGATCATCGC CGTTGAAATG AAAAAAAAAA AAAAAAAAAA
GACACTGGAG GGATTCTGTG ACTTCTATGA AGAGCCCTTT CCTAGTAGCG GCAACTTTAC TTTTTTTTTT TTTTTTTTTT 



+1 KKK 
801 AAAAAAAAAA 
TTTTTTTTTT 




N E G G R K L 
lCG AAGGCGGCCG CAAGCTT 
TTCCGCCGGC GTTCGAA 
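
The two-row displays above pair each sequence row with its complementary strand, which makes single-base OCR damage checkable. The following is a minimal sketch of that check, assuming the lower row is the base-by-base complement of the upper row written in the same left-to-right order; the function names are illustrative and not part of the original figure.

```python
# Translation table for base-by-base complementation (no reversal),
# matching the layout of the two-row sequence display.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Return the complement of seq; characters other than ACGT are kept."""
    return seq.translate(COMPLEMENT)


def mismatches(top, bottom):
    """Return 1-based positions where bottom is not the complement of top."""
    t = top.replace(" ", "")
    b = bottom.replace(" ", "")
    return [i + 1 for i, (x, y) in enumerate(zip(complement(t), b)) if x != y]


# First three 10-mers of the row numbered 81 in the display above.
top = "CTTGTGGAAC TATTCCATTA TCCCCTATCT"
bottom = "GAACACCTTG ATAAGGTAAT AGGGGATAGA"
print(mismatches(top, bottom))  # an empty list means the rows agree
```

A non-empty result points at the positions where one of the two rows was probably misread.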
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vlpHL DRN TLP 



+1 I P L T IGL E R P PRQV PHL DRN TLPK KGL
1 ATCCCACTCA CTATAGGGCT CGAGCGGCCG CCCAGGCAGG TCCCGCACCT TGATAGGAAC ACTTTGCCTA AGAAAGGACT
TAGGGTGAGT GATATCCCGA GCTCGCCGGC GGGTCCGTCC AGGGCGTGGA ACTATCCTTG TGAAACGGAT TCTTTCCTGA 

+ 1 RYT PTSQ LRT QED AKEW LRS H S A GGL 
81 CAGGTATACT CCCACCTCCC AGCTTCGCAC GCAAGAAGAT GCAAAAGAAT GGTTACGGTC CCATTCTGCA GGAGGCCTTC 
GTCCATATGA GGGTGGAGGG TCGAAGCGTG CGTTCTTCTA CGTTTTCTTA CCAATGCCAG GGTAAGACGT CCTCCGGAAG 

♦ 1 Q D T A A N S PFSS GSS VTS PSGT R F N FSO 
161 AGGACACCGC TGCCAATTCC CCCTTTTCCT CTGGCTCCAG CGTGACTTCT CCCTCCGGAA CAAGATTCAA CTTTTCCCAG 
TCCTGTGGCG ACGGTTAAGG GGGAAAAGGA GACCGAGGTC GCACTGAAGA GGGAGGCCTT GTTCTAAGTT GAAAAGGGTC 

+ 1 L A S P TTV TQM SLSN PTM LRT HSLS NAD 
241 CTTGCGAGTC CCACCACTGT CACCCAGATG AGCTTGTCCA ACCCGACCAT GCTGAGGACT CACAGCCTCT CCAATGCTGA 
GAACGCTCAG GGTGGTGACA GTGGGTCTAC TCGAACAGGT TGGGCTGGTA CGACTCCTGA GTGTCGGAGA GGTTACGACT 

+ 1 G Q Y DPYT DSR FRN SSMS L D E KSR TMS 
321 TGGGCAGTAT GATCCATACA CTGACAGCCG CTTCCGGAAT AGCTCCATGT CCCTGGATGA GAAGAGCAGA ACCATGAGCC 
ACCCGTCATA CTAGGTATGT GACTGTCGGC GAAGGCCTTA TCGAGGTACA GGGACCTACT CTTCTCGTCT TGGTACTCGG 

+1 R S G S FRD GFEE VHG SSL SLVS STL SVY 
401 GTTCAGGCTC ATTCCGGGAT GGGTTTGAAG AAGTTCATGG ATCCTCACTC TCCCTGGTTT CCAGCACATT GTCAGTTTAT 
CAAGTCCGAG TAAGGCCCTA CCCAAACTTC TTCAAGTACC TAGGAGTGAG AGGGACCAAA GGTCGTGTAA CAGTCAAATA 

+1 STPE EKC QSE IRKL RRE LDA SQEK VSA
481 TCTACACCAG AAGAAAAATG CCAGTCAGAG ATTCGCAAGC TGCGGCGGGA ACTGGATGCC TCCCAGGAGA AAGTTTCAGC
AGATGTGGTC TTCTTTTTAC GGTCAGTCTC TAAGCGTTCG ACGCCGCCCT TGACCTACGG AGGGTCCTCT TTCAAAGTCG 

+ 1 LTT QLTA NAH LVA AFEQ SLG NMT IRL 
561 TTTGACCACC CAGCTGACAG CAAATGCTCA CCTTGTGGCT GCCTTTGAAC AGAGTCTTGG TAACATGACA ATCAGGCTCC 
AAACTGGTGG GTCGACTGTC GTTTACGAGT GGAACACCGA CGGAAACTTG TCTCAGAACC ATTGTACTGT TAGTCCGAGG 

+ 1QSLT MTA EQKD SEL N E L RKTI ELL KKQ 
641 AGAGTCTGAC CATGACAGCT GAGCAGAAGG ATTCAGAACT GAATGAGTTA AGAAAAACCA TTGAGCTGCT AAAGAAACAG 
TCTCAGACTG GTACTGTCGA CTCGTCTTCC TAAGTCTTGA CTTACTCAAT TCTTTTTGGT AACTCGACGA TTTCTTTGTC 

+1 N A A A QAA ING VINT PEL NCK GNGT AQS 
721 AACGCAGCTG CCCAGGCTGC CATTAATGGA GTAATTAACA CACCTGAGCT CAACTGCAAA GGAAACGGCA CTGCCCAGTC 
TTGCGTCGAC GGGTCCGACG GTAATTACCT CATTAATTGT GTGGACTCGA GTTGACGTTT CCTTTGCCGT GACGGGTCAG 

+ 1 ADL RIRR Q H S SDS VSSI NSA TSH SSV 
801 TGCAGACCTC CGCATCCGCA GGCAGCACTC CTCAGACAGC GTCTCCAGCA TCAACAGTGC CACCAGCCAC TCCAGTGTGG 
ACGTCTGGAG GCGTAGGCGT CCGTCGTGAG GAGTCTGTCG CAGAGGTCGT AGTTGTCACG GTGGTCGGTG AGGTCACACC 

♦1 G S N I ESD SKKK KRK NWL RSSF KQA FGK 
881 GCAGCAACAT AGAGAGTGAC TCAAAGAAGA AGAAGAGGAA GAACTGGTTA CGCAGCTCCT TCAAGCAAGC TTTCGGGAAG 
CGTCGTTGTA TCTCTCACTG AGTTTCTTCT TCTTCTCCTT CTTGACCAAT GCGTCGAGGA AGTTCGTTCG AAAGCCCTTC 

+ 1 K K S P KSA SSH SDIE ETT DSS LPSS PKL 
961 AAGAAGTCCC CAAAATCTGC GTCCTCTCAT TCAGATATTG AGGAGACGAC GGATTCTTCT TTGCCTTCCT CACCAAAGTT
TTCTTCAGGG GTTTTAGACG CAGGAGAGTA AGTCTATAAC TCCTCTGCTG CCTAAGAAGA AACGGAAGGA GTGGTTTCAA 

+ l PHN GSTG STP LLR NSHS NSL ISE CMD 
1041 ACCGCACAAT GGGTCCACAG GTTCCACCCC ACTGCTGAGG AATTCTCACT CCAACTCTCT AATTTCCGAA TGCATGGATA 
TGGCGTGTTA CCCAGGTGTC CAAGGTGGGG TGACGACTCC TTAAGAGTGA GGTTGAGAGA TTAAAGGCTT ACGTACCTAT 

+ 1SEAE TVM QLRN ELR DKE MKLT D I R I 
1121 GTGAAGCTGA GACCGTCATG CAGCTCCGAA ATGAGTTAAG AGACAAGGAG ATGAAGCTGA CGGATATCCG C
CACTTCGACT CTGGCAGTAC GTCGAGGCTT TACTCAATTC TCTGTTCCTC TACTTCGACT GCCTATAGGC G
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♦2 E F E L G T L TIG LER PPGQ 
1 CGAATTCGAG CTCGGTACAC TCACTATAGG GCTCGAGCGG CCGCCCGGGC 
GCTTAAGCTC GAGCCATGTG AGTGATATCC CGAGCTCGCC 



D G F E E V H 
AGGTCCGGGA TGGGTTTGAA GAAGTTCATG
TCCAGGCCCT ACCCAAACTT CTTCAAGTAC 



+2 G S S L SLV SSTS SVY STP EEKC QSE IRK 
81 GATCCTCACT CTCCTTGGTT TCCAGCACAT CGTCAGTTTA TTCTACACCA GAAGAAAAAT GCCAGTCAGA GATTCGCAAG 
CTAGGAGTGA GAGGAACCAA AGGTCGTGTA GCAGTCAAAT AAGATGTGGT CTTCTTTTTA CGGTCAGTCT CTAAGCGTTC 

+ 2LRRE LDA SQE KVSA LTT QLT ANAH L V A 
161 CTGCGGCGGG AACTGGATGC CTCCCAGGAG AAAGTTTCAG CTTTGACCAC CCAGCTGACA GCAAATGCTC ACCTTGTGGC 
GACGCCGCCC TTGACCTACG GAGGGTCCTC TTTCAAAGTC GAAACTGGTG GGTCGACTGT CGTTTACGAG TGGAACACCG 

+2 AFE QSLG NMT IRL QSLT MTA EQK DSE 
241 AGCCTTTGAA CAGAGTCTTG GTAACATGAC AATCAGGCTC CAGAGTCTGA CCATGACAGC TGAGCAGAAG GACTCAGAAC 
TCGGAAACTT GTCTCAGAAC CATTGTACTG TTAGTCCGAG GTCTCAGACT GGTACTGTCG ACTCGTCTTC CTGAGTCTTG 

+2 L N E L RKT IELL KKQ N A A A Q A A ING VIM 
321 TGAATGAGTT AAGAAAAACC ATTGAGCTGC TAAAGAAACA GAACGCAGCT GCCCAGGCTG CCATTAATGG AGTAATTAAC 
ACTTACTCAA TTCTTTTTGG TAACTCGACG ATTTCTTTGT CTTGCGTCGA CGGGTCCGAC GGTAATTACC TCATTAATTG 

♦2 T P E L NCK GNG TAQS ADL RIR RQHS SDS 
401 ACACCTGAGC TCAACTGCAA AGGAAACGGC ACTGCCCAGT CTGCAGACCT CCGCATCCGC AGGCAGCACT CCTCAGACAG 
TGTGGACTCG AGTTGACGTT TCCTTTGCCG TGACGGGTCA GACGTCTGGA GGCGTAGGCG TCCGTCGTGA GGAGTCTGTC 

*2 VSS INSA TSH SSV GSNI BSD SKK KKR 
481 CGTCTCCAGC ATCAACAGTG CCACCAGCCA CTCCAGCGTG GGCAGCAACA TAGAGAGTGA CTCAAAGAAG AAGAAGCGGA 
GCAGAGGTCG TAGTTGTCAC GGTGGTCGGT GAGGTCGCAC CCGTCGTTGT ATCTCTCACT GAGTTTCTTC TTCTTCGCCT 

+2 K N W V MEL RSSF K Q A FGK KKSP KSA SSH 
561 AGAACTGGGT CAATGAGTTA CGCAGCTCCT TCAAGCAAGC TTTCGGGAAG AAGAAGTCCC CAAAATCTGC GTCCTCTCAT 
TCTTGACCCA GTTACTCAAT GCGTCGAGGA AGTTCGTTCG AAAGCCCTTC TTCTTCAGGG GTTTTAGACG CAGGAGAGTA 

+2 S D I E EMT DSS LPSS PKL PHN GSTG STP 
641 TCAGATATTG AGGAGATGAC GGATTCTTCT TTGCCTTCCT CACCAAAGTT ACCGCACAAT GGGTCCACAG GTTCCACCCC 
AGTCTATAAC TCCTCTACTG CCTAAGAAGA AACGGAAGGA GTGGTTTCAA TGGCGTGTTA CCCAGGTGTC CAAGGTGGGG 



+ 2 LLR NSHS NSL ISE CMOS EAE TVM QLR 
721 ACTGCTGAGG AATTCTCACT CCAACTCTCT AATTTCAGAA TGCATGGATA GTGAAGCTGA GACCGTCATG CAGCTCCGAA 
TGACGACTCC TTAAGAGTGA GGTTGAGAGA TTAAAGTCTT ACGTACCTAT CACTTCGACT CTGGCAGTAC GTCGAGGCTT 



+2 NELR DKE MKLT D
801 ATGAGTTAAG AGACAAGGAG ATGAAGCTGA CGGATAT
TACTCAATTC TCTGTTCCTC TACTTCGACT GCCTATA 



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Figure 12. Tblastn search of the EST division of Genbank with 
680 aa starting at the C-terminus of the alpha-actinin
domain of Hu-UNC-53/2.



LOCUS       AA418158      610 bp    mRNA            EST       19-MAY-1997
DEFINITION  zv97d12.r1 Soares NhHMPu S1 Homo sapiens cDNA clone 767735 5'.
ACCESSION   AA418158
NID         g2079968
KEYWORDS    EST.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Mammalia; Eutheria; Primates; Catarrhini; Hominidae;
            Homo.
REFERENCE   1  (bases 1 to 610)
  AUTHORS   Hillier,L., Allen,M., Bowles,L., Dubuque,T., Geisel,G., Jost,S.,
            Kucaba,T., Lacy,M., Le,N., Lennon,G., Marra,M., Martin,J.,
            Moore,B., Schellenberg,K., Steptoe,M., Tan,F., Theising,B.,
            White,Y., Wylie,T., Waterston,R. and Wilson,R.
  TITLE     WashU-Merck EST Project 1997
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Wilson RK
            WashU-Merck EST Project
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: est@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            Seq primer: -28M13 rev2 ET from Amersham
            High quality sequence stop: 492.
FEATURES             Location/Qualifiers
     source          1..610
                     /organism="Homo sapiens"
                     /note="Organ: mixed (see below); Vector: pT7T3D-Pac
                     (Pharmacia) with a modified polylinker; Site_1: Not I;
                     Site_2: Eco RI; Equal amounts of plasmid DNA from three
                     normalized libraries (melanocyte 2NbHM, pregnant uterus
                     NbHPU, and fetal heart NbHH19W) were mixed, and ss circles
                     were made in vitro. Following HAP purification, this DNA
                     was used as tracer in a subtractive hybridization
                     reaction. The driver was PCR-amplified cDNAs from pools of
                     5,000 clones made from the same 3 libraries. The pools
                     consisted of I.M.A.G.E. clones 260232-265223,
                     340488-345479, and 484488-489479."
                     /clone="767735"
                     /clone_lib="Soares NhHMPu S1"
                     /tissue_type="Pooled human melanocyte, fetal heart, and
                     pregnant uterus"
                     /lab_host="DH10B"
     mRNA            <1..>610
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mRNA 
                     /clone="5D16"
                     /clone_lib="Zebrafish ICRFzfls"
                     /sex="mixed"
                     /tissue_type="pooled 26-somite embryos"
                     /lab_host="XL1-blue MRF'"
     mRNA            complement(<1..>418)
BASE COUNT      108 a     87 c     78 g    145 t
ORIGIN
        1 tttacatttt ttgaggaaga tgctaatggt ctattctgat tcaatgattt atgctaagct
       61 aagctaaaat gctcctgtca aatcctgaga tcagctgaat gaattaaaaa tttggtaaaa
      121 ctcaactgtc taactctagg ggagttgtaa aatgggccta tttccctaaa aagtaatgtt
      181 actttaagag catgatggtc caccagtttc actgtctaaa ttttgttatt ccataagcta
      241 atcttctctg ggcattttga cgattttaac actaacctgt gggtaatctg cgtcccccgt
      301 aaactggaca tggtttcttc cagattctgt ctcagatcag caatgttctt cactgtacgc
      361 atccgtctag tttctggatc ttctcctgag atctcctcca ggcactgttt ggcggtct
//



gb|AA495042|AA495042 fa05f06.s1 Zebrafish ICRFzfls Danio rerio cDNA
clone 5D16 3'
Length = 418

Minus Strand HSPs:

Score = 195 (87.9 bits), Expect = 9.9e-18, P = 9.9e-18
Identities = 37/46 (80%), Positives = 42/46 (91%), Frame = -3

Query: 627 TGQPALEELTGEDPEARRLRTVKNIADLRQNLEETMSSLRGTQVTH 672
           T +  LEE++GEDPE RR+RTVKNIADLRQNLEETMSSLRGTQ+TH
Sbjct: 416 TAKQCLEEISGEDPETRRMRTVKNIADLRQNLEETMSSLRGTQITH 279
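
The "Identities" figure reported for this HSP can be recomputed from the aligned residues. The sketch below assumes a cleaned reading of the Query and Sbjct rows, with OCR damage repaired using the translated EST sequence, and an ungapped alignment. The function name is hypothetical, and the percentage is truncated rather than rounded, which is an assumption that happens to agree with the percentages printed in the reports on these pages.

```python
def identity_stats(query, sbjct):
    """Count identical aligned residues and return (identities, length, percent)."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have equal length")
    identities = sum(1 for q, s in zip(query, sbjct) if q == s)
    length = len(query)
    # Truncating integer percentage (assumed convention, see lead-in).
    return identities, length, 100 * identities // length


# Cleaned reading of the zebrafish HSP (query residues 627-672).
query = "TGQPALEELTGEDPEARRLRTVKNIADLRQNLEETMSSLRGTQVTH"
sbjct = "TAKQCLEEISGEDPETRRMRTVKNIADLRQNLEETMSSLRGTQITH"

identities, length, percent = identity_stats(query, sbjct)
print(f"Identities = {identities}/{length} ({percent}%)")
```

"Positives" additionally counts conservative substitutions scored above zero by the substitution matrix, so it cannot be derived from the two rows alone without that matrix.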

MOUSE 2 



LOCUS       AA208994      527 bp    mRNA            EST       18-FEB-1997
DEFINITION  mw75e12.r1 Soares mouse NML Mus musculus cDNA clone 676558 5'.
ACCESSION   AA208994
NID         g1807004
KEYWORDS    EST.
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Mus.
REFERENCE   1  (bases 1 to 527)
  AUTHORS   Marra,M., Hillier,L., Allen,M., Bowles,M., Dietrich,N., Dubuque,T.,
            Geisel,S., Kucaba,T., Lacy,M., Le,M., Martin,J., Morris,M.,
            Schellenberg,K., Steptoe,M., Tan,F., Underwood,K., Moore,B.,
            Theising,B., Wylie,T., Lennon,G., Soares,B., Wilson,R. and
            Waterston,R.
  TITLE     The WashU-HHMI Mouse EST Project
  JOURNAL   Unpublished (1996)
COMMENT     Contact: Marra M/Mouse EST Project
            WashU-HHMI Mouse EST Project
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
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Tel: 314 286 1800 
            Fax: 314 286 1810
            Email: mouseest@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            MGI:416262
            Putative full length read
            vector to vector length is 535
            Seq primer: -28M13 rev2 ET from Amersham
            High quality sequence stop: 478.
FEATURES             Location/Qualifiers
     source          1..527
                     /organism="Mus musculus"
                     /note="Vector: pT7T3D-Pac (Pharmacia) with a modified
                     polylinker; Site_1: Not I; Site_2: Eco RI; 1st strand cDNA
                     was primed with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCGAATCTTTTTTTTTTTTTTTTTTT 3'];
                     double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not I
                     and Eco RI sites of the modified pT7T3 vector. Library
                     constructed and normalized by Bento Soares and M.Fatima
                     Bonaldo."
                     /clone="676558"
                     /clone_lib="Soares mouse NML"
                     /tissue_type="Liver"
                     /lab_host="DH10B"
     mRNA            <1..>527
BASE COUNT      151 a    139 c    136 g    101 t

ORIGIN 

1 tgtctctgga tgagaagagc cgaacaatga gtcggtcagg ctccttccgg gatgggtttg 
61 aggaagttca tggatcctcc ctgtccttgg tttccagcac atcctccatc tactccacgc 
121 cagaagaaaa atgccagtca gagattcgaa agctgaggcg agacgtggat gcctcccagg 
181 aaaaggtgtc tgcgctgact acccagctga ctgcaaatgc tcaccttgtg gcagccttcg 
241 agcagagtct gggaaacatg accatcaggc tacagagttt aactatgacc gctgagcaga 
301 aggattcaga actgaacgag ttaagaaaaa ccatcgagct gctgaagaaa cagaatgcag 
361 ctgcccaggc tgccattaat ggagtgatta acacgccaga gctcaactgc aaaggaaatg 
421 gcagtgccag gctacagacc tacgcatccg cagcaacact cctccgacag tgtctccagt 
481 atcaatagcg ccaccagcca ctcaagtgtg ggcagcaaca tagagag 
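
A quick consistency check on a record like the one above is that the four BASE COUNT figures add up to the length given on the LOCUS line. This is a minimal sketch of that check, assuming the BASE COUNT line has been recovered as a single line of text; `parse_base_count` is an illustrative helper, not part of any GenBank tooling.

```python
import re


def parse_base_count(line):
    """Parse a GenBank 'BASE COUNT' line into a {base: count} dictionary."""
    # Each count is a run of digits followed by whitespace and one base letter.
    pairs = re.findall(r"(\d+)\s+([acgt])\b", line.lower())
    return {base: int(number) for number, base in pairs}


line = "BASE COUNT 151 a 139 c 136 g 101 t"
counts = parse_base_count(line)
total = sum(counts.values())
print(counts, total)  # the total should equal the 527 bp stated on the LOCUS line
```

The same check applied to the other records on these pages (337 bp, 348 bp and 418 bp) is a cheap way to spot a misread digit.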

gb|AA208994|AA208994 mw75e12.r1 Soares mouse NML Mus musculus cDNA
clone 676558 5'
Length = 527

Plus Strand HSPs:

Score = 541 (243.9 bits), Expect = 2.3e-76, Sum P(2) = 2.3e-76
Identities = 110/143 (76%), Positives = 114/143 (79%), Frame = +3

Query: 1511 SLDEKSRTMSRSGSFRDGFEEVHGXXXXXXXXXXXXXXXPEEKCQSEIRKLRRELDASQE 1570
            SLDEKSRTMSRSGSFRDGFEEVHG               PEEKCQSEIRKLRR++DASQE
Sbjct:    3 SLDEKSRTMSRSGSFRDGFEEVHGSSLSLVSSTSSIYSTPEEKCQSEIRKLRRDVDASQE 182

Query: 1571 KVSALTTQLTANAHLVAAFEQSLGNMTIRLQSLTMTAEQKDSELNELRKTIEXXXXXXXX 1630



WO 98/24810    191/270    PCT/EP97/06956



            KVSALTTQLTANAHLVAAFEQSLGNMTIRLQSLTMTAEQKDSELNELRKTIE
Sbjct:  183 KVSALTTQLTANAHLVAAFEQSLGNMTIRLQSLTMTAEQKDSELNELRKTIELLKKQNAA 362

Query: 1631 XXXXXXGVINTPELNCKGNGTAQ 1653
                  GVINTPELNCKGNG+A+
Sbjct:  363 AQAAINGVINTPELNCKGNGSAR 431

Score = 116 (52.3 bits), Expect = 2.3e-76, Sum P(2) = 2.3e-76
Identities = 24/25 (96%), Positives = 25/25 (100%), Frame = +1

Query: 1661 RQHSSDSVSSINSATSHSSVGSNIE 1685
            +QHSSDSVSSINSATSHSSVGSNIE
Sbjct:  451 QQHSSDSVSSINSATSHSSVGSNIE 525



LOCUS       AA049124      337 bp    mRNA            EST       09-SEP-1996
DEFINITION  mj46f04.r1 Soares mouse embryo NbME13.5 14.5 Mus musculus cDNA
            clone 479167 5'.
ACCESSION   AA049124
NID         g1528794
KEYWORDS    EST.
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Mus.
REFERENCE   1  (bases 1 to 337)
  AUTHORS   Marra,M., Hillier,L., Allen,M., Bowles,M., Dietrich,N., Dubuque,T.,
            Geisel,S., Kucaba,T., Lacy,M., Le,M., Martin,J., Morris,M.,
            Schellenberg,K., Steptoe,M., Tan,F., Underwood,K., Moore,B.,
            Theising,B., Wylie,T., Lennon,G., Soares,B., Wilson,R. and
            Waterston,R.
  TITLE     The WashU-HHMI Mouse EST Project
  JOURNAL   Unpublished (1996)
COMMENT     Contact: Marra M/Mouse EST Project
            WashU-HHMI Mouse EST Project
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: mouseest@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            MGI:289911
            Seq primer: -28M13 rev2 from Amersham
            High quality sequence stop: 292.
FEATURES             Location/Qualifiers
     source          1..337
                     /organism="Mus musculus"
                     /strain="C57BL/6J"
                     /note="Vector: pT7T3D-Pac (Pharmacia) with a modified
                     polylinker; Site_1: Not I; Site_2: Eco RI; 1st strand cDNA
                     was primed with a Not I - oligo(dT) primer [5'
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TGTTACCAATCTGAAGTGGGAGCGGCCGCGGAAATTTTTTTTTTTTTTTTTTTTTTTT 
                     T 3'], on equal amounts of mRNA from 2 13.5dpc and 2
                     14.5dpc embryos (total RNA provided by Minoru Ko, Wayne
                     State Univ., from 2 J; double-stranded cDNA was ligated to
                     Eco RI adaptors (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of the modified
                     pT7T3 vector. Library went through one round of
                     normalization, and was constructed by Bento Soares and
                     M.Fatima Bonaldo."
                     /clone="479167"
                     /clone_lib="Soares mouse embryo NbME13.5 14.5"
                     /sex="unknown"
                     /tissue_type="embryo"
                     /dev_stage="13.5-14.5dpc total fetus"
                     /lab_host="DH10B"
     mRNA            <1..>337
BASE COUNT       80 a    101 c     97 g     59 t

ORIGIN 

1 catcctctgt gggcaccgag gtcaccgaga cccctgctca ttcagtcccc cacactagac 
61 tgttccaagc caatgaagag gaggagccag agaagaagga ggtatcagaa ctgcgctctg 
121 aactatggga aaaagagatg aagctcacgg atatccggtt ggaggccctc aactctgccc 
181 accagctgga ccagcttcgg gagaccatgc acaatatgca gttggaggtg gacctgctga 
241 aagcagagaa tgaccggctg aaggttgccc ccgggccctc ctcaggctgc actccagggc 
301 aggtccctgg gtcatcggct ctgtcgtccc ctcgacg 



gb|AA049124|AA049124 mj46f04.r1 Soares mouse embryo NbME13.5 14.5 Mus
musculus cDNA clone 479167 5'
Length = 337

Plus Strand HSPs:

Score = 206 (92.9 bits), Expect = 3.9e-19, P = 3.9e-19
Identities = 42/60 (70%), Positives = 51/60 (85%), Frame = +3

Query: 1760 DSEAETVMQLRNELRDKEMKLTDIRLEALSSAHQLDQLREAMNRMQSEIEKLKAENDRLK 1819
            + E + V +LR+EL +KEMKLTDIRLEAL+SAHQLDQLRE M+ MQ E++ LKAENDRLK
Sbjct:   84 EPEKKEVSELRSELWEKEMKLTDIRLEALNSAHQLDQLRETMHNMQLEVDLLKAENDRLK 263



LOCUS       AA185349      348 bp    mRNA            EST       07-JAN-1997
DEFINITION  mu51c03.r1 Soares mouse lymph node NbMLN Mus musculus cDNA clone
            642916 5'.
ACCESSION   AA185349
NID         g1769059
KEYWORDS    EST.
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Mus.
REFERENCE   1  (bases 1 to 348)






  AUTHORS   Marra,M., Hillier,L., Allen,M., Bowles,M., Dietrich,N., Dubuque,T.,
            Geisel,S., Kucaba,T., Lacy,M., Le,M., Martin,J., Morris,M.,
            Schellenberg,K., Steptoe,M., Tan,F., Underwood,K., Moore,B.,
            Theising,B., Wylie,T., Lennon,G., Soares,B., Wilson,R. and
            Waterston,R.
  TITLE     The WashU-HHMI Mouse EST Project
  JOURNAL   Unpublished (1996)
COMMENT     Contact: Marra M/Mouse EST Project
            WashU-HHMI Mouse EST Project
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: mouseest@watson.wustl.edu
            This clone is available royalty-free through LLNL; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            MGI:394908
            Seq primer: -28M13 rev2 from Amersham
            High quality sequence stop: 336.
FEATURES             Location/Qualifiers
     source          1..348
                     /organism="Mus musculus"
                     /strain="C57BL/6J"
                     /note="Vector: pT7T3D-Pac (Pharmacia) with a modified
                     polylinker; Site_1: Not I; Site_2: Eco RI; [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCGATACTTTTTTTTTTTTTTTTTTTTTTTT
                     3']; double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. RNA
                     provided by Dr. Bertrand Jordan. Library constructed and
                     normalized by Bento Soares and M.Fatima Bonaldo."
                     /clone="642916"
                     /clone_lib="Soares mouse lymph node NbMLN"
                     /sex="male"
                     /dev_stage="4 weeks"
                     /lab_host="DH10B"
     mRNA            <1..>348
BASE COUNT       93 a     95 c     78 g     82 t
ORIGIN
        1 attcggcact gaggggatga ataatccacc aaattagtgt gtacatagga gttgctgggc
       61 ccccccccac tcttatctgc tgtagctagc ctctccctaa gcctcgcatc ttctctaaat
      121 ctatctctgc gttcttacca cttgttctgg ccaatagaac tccggatcaa gaggcagaat
      181 tcctcagata gcatctccag cctcaacagc atcaccagcc attccagcat cggcagcagc
      241 aaagatgctg atgccaagaa gaaaaagaag aagagttggg taagtaaagg cttggagata
      301 ggcctgtgct aggagtcact caccctgttg cagggaactg accccttt
//



gb|AA185349|AA185349 mu51c03.rl Soares mouse lymph node NbMLN Mus 
musculus cDNA clone 642916 5' 
Length = 348 






Plus Strand HSPs:

Score = 154 (69.4 bits), Expect = 4.4e-12, P = 4.4e-12
Identities = 27/42 (64%), Positives = 40/42 (95%), Frame = +1

Query: 1656 DLRIRRQHSSDSVSSINSATSHSSVGSNIESDSKKKKRKNWL 1697 

+LRI+RQ+SSDS+SS+NS TSHSS+GS+ ++D+KKKK+K+W+ 
Sbjct: 157 ELRIKRQNSSDSISSLNSITSHSSIGSSKDADAKKKKKKSWV 282 






"SIM output with parameters: 
substitution scores in BLOSUM62 
O = 12, E = 4"

Sequence 1: hu1, 1702 residues

Sequence 2: hu2, 2350 residues

List of local alignments with score >= 100.0



Pl6>m Ala. 



46.8% identity in 1726 residues overlap; Score: 2538.0; Gap frequency: 9.3% 

hul ' 78 DPESQRKRTVQNVLDLRQNLEETMSSLRGSQVTHSSLEMTCYDS — DDANPRSVSSLSNR 

hu2 ' 639 DPEARRLRTVKNIADLRQNLEETMSSIiRGTQVTHSTLETTFDTNVTTEMSGRSILSLTGR 

*** * *** * *************** ***** ** * ** ** * 

hul ' 136 SSPLSWRYGQSSPRLQAGDAPSVGGSCRSEGTPAWYMHGERAEYSHTMPMHSPSKLSHIS 

hu2 ' 699 PTPLSWRLGQSSPRLQAGDAPSMGNGYPPRANASRFINTESGRYVYSAPLRRQLASRGSS 

***** ************** * * * * * * 

hul, 196 RLEL-VESLDSDEVDLKS GYMSDSDLMGKTMTEDDDITTG 

hu2 ' 759 VCHVDVSDKAGDEMDLEGISMDAPGYMSDGDVLSKNI— RTDDITSGYMTDGGLGLYTRRL 

* ** ** ***** * * **** * 

hul ' 235 WDESSSISSGLSDASDNLSSEEFNASSSLNSLP 

hu2 • 818 NRLPDGMAWRETLQRNTSLGLGDADSWDDSSSVSSGISDTIDNLSTDDINTSSSISSYA 

** *** *** ** **** * *** * 

hul ' 268 STPTASRRNSTIVI^TDSEKRSIAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRR 

hu2 • 878 NTPASSRKNLDV — QTDAEKHSQVERNSLWSGDDVKKSDGGS— DSG-IKMEPG-SKWRR 

***** ****** * * *** ***** ***** 

hul • 328 ERPESCDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQ SALKVAGK P 

hu2 ' 932 NPSDVSDESDKSTSGKKNPVISQTGSWRRGMTAQVGITMPRTKPSAPAGALKTPGTGKTD 



hul ' 383 EGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSGIARPSTSG SFGYKKPP-PAT 

hu2 ' 992 DAKVSEKGRLSPKASQVKRSPSDAGRSSGDESKKPLPSSSRTPTANANSFGFKKQSGSAA 

***** ** ***** * # ## + 

hul ' 440 GTATVMQTG GSATLSKIQKSSGIPVKPVNGRKTSLDVSNSAEPGFLAPGARSNIQ 

hu2 ' i052 GIJVMITASGVTVTSRSATLGKIPKSSAL~VSRSAGRKSSMDGAQNQDDGYLAX*SSRTNLQ 

* * * **** «* •** * *** * * * ** * * * 

hul ' 49 5 YRSLPRPAKSSSMSVTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTTPA 

hu2 ' 1111 YRSLPRPSKSNSRNGAGNRSS TSSID-SNISSKSAGLPVPKLREPSKTALGSSLPG 

******* ** * * * **** ***** ****** * 

hul ' 555 PVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASHPTATKLAELPPTPLRATA 

hu2 ' 1 * 66 LVNQTDKEKGISSDNESVASCNSVKVNPAAQPVSSPAQTSLQPGAKYPDVASPTLRRLFG 






hul ' 615 KSFVKPPSLANLDKV-NSNSLDLPSSSDTTHAS — KVPDLHATSSASGGPL P 

hu2 , 1226 GKPTKQVPIATAENMKNSWISNPHATMTQQGNLDSPSGSGVLSSGSSSPLYSKNVDLNQ 

* * ** * * ** * ** 

hul ' 664 SCFTPSPAPILNINSASFSQGLELMSGFSVPKETRMYPKLSGLHRSMESIiQMPMS LP 

hu2 ' 1286 SPLASSPSSAHSAPSNSLTWGTNASSSSAVSKDGLGFQSVSSLHTSCESIDISLSSGGVP 

* ** * * * * * * ****** * * 

hul ' 721 SAFPSSTPVPTPPAPPAAP-TEEETEELTWSGSPRAGQLDSNQ RD 

hu2 ' 1346 SHNSSTGLIASSKDDSLTPFVRTNSVKTTLSESPLSSPAASPKFCRSTLPRKQDSDPHLD 

* * * * * ** * * 

hul ' 7 *5 RNTLPKKGLRY QLQSQEETKERRHSHTIGGLPESDDQSELPSPPALPMSLSAKGQL 

hu2 ' 1 4 °6 RNTLPKKGLRYTPTSQLRTQEDAKEWLRSHSAGGLQDTAANSPFSSGSSVTSPSGTRFNF 

*********** ** ** ww Witit * * 

hul ' 821 TNIVSPTAAT TPRITRSNSIPTHEAAFELYSGSQM-GSTLSLAERPKGMIRSGSF 

hu2 ' 1466 SQIASPTTVTQMSLSNPTMLRTHSLSNADGQYDPYTDSRFRNSSMSLDEKSRTMSRSGSF 

*** * * * * ******* ***** 

hul ' 8? 5 RDPTDDVHGSVLSLASSASSTYSSAEERMQSEQIRKLRRELESSQEKVATLTSQLSANAN 

hu2 ' 1526 RDGFEEVHGSSLSLVSSTLSVYSTPEEKCQSE— IRKLRRELDASQEKVSALTTQLTANAH 



hul ' 935 LVAAFEQSLVNMTSRLRBIJVETAEEKDTELLDIjRETIDFIJCKKNSEAQAVIQGALNASET 

hu2 ' 1 5Q 5 LVAAFEQSLGNMTIRLQSLTMTAEQKDSELNELRKTIELLKKQNAAAQAAINGVINTPEL 



hul ' 995 TPK EIaRIKRQNSSDSISSLNSITSHSSIGSSKDADAKKKKKKSWVYELRSSF 

hu2 ' 1645 NCKGNGTAQSADLRIRRQHSSDSVSSINSATSHSSVGSNIESDSKKKKRKNW LRSSF 

* *** ** **** ** ** ***** ** * **»* * * ***** 

hul ' 1047 NKAFSIKKGPKSASSYSDIEEIATPDSSAPSSPKLQHGSTETASPSIKSSTLSSVGTDVT 

hu2 ' 1702 KQAFGKKKSPKSASSHSDIEE — TTDSSLPSSPKLPHNGSTGSTPLLRNSHSNSL 

** ** ****** ***** * *** ****** * * * * 

hul ' 1107 EGPAHPAPHTRIjFHANEEEEPEKKEVSEIJISELWEKEMKLTDIRI£ALNSAHQLDQIJRET 

hu2 ' 1755 ISECMDSEAETVMQLRNELRDKEMKLTDIRLEALSSAHQLDQLREA 

* * ***** ************* ********** 

hul ' 1167 MHNMQIiEVDLLKAENDRIiKVAPGPSSGSTPGQVPGSSALS~SPRRSLGLALTHSFGPSLA 
hu2 ' 1801 MNRMQSEIEKLKAENDRLK SESQGSGCSRAPSQVSISASPRQSMGLS-QHSLNLTES 

* ** * ********* * ** * # # w# 

hul ' 1226 DTDLSPMDGISTCGPKEEVT— LRVWRMPPQHIIKGDLKQQEFFLGCSKVSGKVDWKML 

hu2 ' 1857 TSLDMLLDDTGECSARKEGGRHVKIWSFQEEMKWKEDSRPHLFLIGCIGVSGKTKWDVL 

* * * ** * * * ** **** * , 

hu 1 ' 1284 DE AVFQVFKD YI SKMDPASTLGLSTE S I BGYS I SH VKRVLDAEPPEMPPCRRGVNN I 

hu2 ' 1917 DGWRRLFKEYIIHVDPVSQLGLNSDSVLGYSIGEIKRSNTSETPELLPCGYLVGENTTI 

* * ** ** ** * *** * **** ** ****** » 






*> ul » 1341 SVSLKGXJ^KCVDSLVFETLIPKPMMQHYISLLLKHRRLVLSGPSGTGKTYLTmU^AEYL 

hu2 » 1977 SVTVKGIAENSLDSLVFESLIPKPILQRYVSLLIEHRRIILSGPSGTGKTYLANRLSEYI 

** *** * ****** ***** * * *** *** ************ * # 

hul » 140 ^ VERSGREVTEGIVSTFNMHQQSCKDLQLYLSNLANQIDRETGIGDVPLVILLDDLSEAGS 

hu2 , 2037 VI*REGRELTDGVIATFNVDHKSSKEI*RQYLSNIJVDQCNSENNAVDMPLVIILDNLHBVSS 

* * *** * * *** * * * ****** * * * **** ** * * 

hul ' 1461 ISELVNGALTCKYHKCPYIIGTTNQPVKMTPNHGLHLSFRMLTFSNNVEPANGFLVRYLR 

hu2 , 2097 LGEIFNGLLNCKYHKCPYIIGTMNQATSSTPNLQLHHNFRWVLGANHTEPVKGFLGRFLR 

* ** * ************ ** *** ** # #w ^ #<r # ^ 

hul ' 15 21 RKLVESDSDINANKEELLRVLDWVPKLWYBLHTFLEKHSTSDFLIGPCFFLSCPIGIEDF 

hu2 ' 2157 RKLMETEISGRVRNMELVKIIDWIPKVWHHLNRFLEAHSSSDVTIGPRLFLSCPIDVDGS 

*** * ** ** ** * ** *** ** ** *** ****** 

hul ' 1581 RTWFIDLWNNSIIPYLQEGAKDGIKVHGOKAAWEDPVEWVRDTLPWPSAQQDQS KLYH 

hu2 ' 2217 RVWFTDLWNYSIIPYLLEAVREGLQLYGRRAPWEDPAKWVMDTYPWAASPQQHEWPPLLQ 



hu l' 1639 LPPPTVGPHSIASPPEDRTVKDSTPSSLDSDPI*MA>II*IiKLQEAANY 

hu2 * 2277 LRPEDVGFDGYSMPREGSTSKQMPPSDAEGDPLMNRLMRLQEAANY 

* * ** * * * * ** **** ** ******* 

WARNING: 49 local alignments have not been reported because of score < 100.0
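
The one-line summary that heads each SIM local alignment can be turned back into approximate column counts. The following is a small sketch, assuming the summary line has the wording shown above and that both percentages are taken over the stated overlap length; the field names in the returned dictionary are illustrative.

```python
import re

# Pattern for a SIM summary line as it appears in this output.
SUMMARY = re.compile(
    r"(?P<identity>[\d.]+)% identity in (?P<overlap>\d+) residues overlap; "
    r"Score: (?P<score>[\d.]+); Gap frequency: (?P<gaps>[\d.]+)%"
)


def parse_summary(line):
    """Extract the numbers from a SIM summary line and derive rough counts."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not a SIM summary line")
    overlap = int(m["overlap"])
    identity = float(m["identity"])
    gaps = float(m["gaps"])
    return {
        "identity_pct": identity,
        "overlap": overlap,
        "score": float(m["score"]),
        "gap_pct": gaps,
        # Rounded estimates only: the report prints one decimal place.
        "approx_identical": round(overlap * identity / 100),
        "approx_gap_columns": round(overlap * gaps / 100),
    }


line = ("46.8% identity in 1726 residues overlap; "
        "Score: 2538.0; Gap frequency: 9.3%")
summary = parse_summary(line)
print(summary["approx_identical"], summary["approx_gap_columns"])
```

For the hu1/hu2 alignment this gives roughly 808 identical columns and about 161 gap columns out of 1726.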



"SIM output with parameters:
substitution scores in BLOSUM62
O = 12, E = 4"

Sequence 1: Cel, 1533 residues

Sequence 2: hu2, 2350 residues

List of local alignments with score >= 54.0



32.8% identity in 504 residues overlap; Score: 490.0; Gap frequency: 6.9% 

Cel ' 1058 VIE IiK QE LKE R D S AL YE VRL DNLDRARE VD VLRE TVNKLK TE NK QLKKE VDK L TN GP ATR 

hu2 ' X766 VMQLRNELRDKEMKLTDIRIiEALSSAHQLDQLREAMNRMQSE IEKLKAENDRLKSESQGS 

* * ** ****** *** * * ***** 

CQl * 1 1 1 8 ASSRAS IPVI YD DEHVYDAACSST SASQSSKRSSGCNSIKVTVNV 

hu2 ' 1826 GCSRAp SQVSISASPRQSMGLSQHSLNLTESTSLDMLLDDTGECSARKEGGRHVKIWSF 

*** * ***** 

Cel ' 1163 DI AGE I SS I VNPDKE 1 1 VGYLAMS TSQSCWKDI DVS ILGLFE VYLSRIDVEHQLGIDARD 






hu2 ' 1886 QEEMKWKEDSRPHL-FLIGCIGVS-GKTKWDVLDGVVRRLFKEYIIHVDPVSQLGLNS-D 

* * * * * ** * * *** w 

Cel , 1223 SILGYQIGELRRVIGDSTTMITSHPTOILTSSTTIRMFMHGAAQSRVDSIjVLDMIJjPKQM 

hu2 ' 1943 SVLGYSIGEIKRSNTSETPELLPCGY-LVGENTTISVTVKGLAENSLDSLVFESLIPKPI 

* *** *** * * ***** **** * # „ 

Cel ' !283 I LQL VK S I LTE RRLVLAGAT G I GK SKLAK T LAA YVS IRTN Q — SE D S I VN I S I PE N NKEE 

hu2 ' 2002 LQRWSI^IEHRilllLSGPSGTGKTYIiAiniLSEYIVIiREGRELTDGVIATFNVDHKSSKE 

************ 

Cel ' 1341 LLQVERRLEKILRSKESCI VILDNIPKNRIAFWSVFANV-PLQNNEGPFWCTV 

hu2 ' 20 ^2 LRQYLSNLADQCNSENNAVDMPLVIILDNL— HHVSSLGEIFNGLLNCKYBKCPYIIGTM 

* * * * **** * * * 

Cel ' 1395 NRY — QIPELQIHHNFKMSVMSNRLE GFILRYLRRRAVEDEYRLTVQMPSELFKIID 

hu2 ' 2120 NQATSSTPNLQLHHNFRWVLCANHTEPVKGFLGRFLRRKLMETEISGRVRN-MELVKIID 

* * ** **** * * ** * *** * * # **** 

Cel ' 1450 FFPIAIiQAVNNFIEKTNSVDVTVGPRACLNCPLTVDGSREWFIRLWNENFIPYLERVARO 

hu2 ' 2 1 7 9 WIPKVWHHLNRFLEAHSSSDVTIGPRLFLSCPIDVDGSRVWFTDLWNYSIIPYLLEAVRE 

* * * * * *** *** * ** ***** ** *** **** * 

Cel, 1510 GKKTFGRCTSFEDPTDIVSEKWPW 

**u2, 2239 GLQLYGRRAPWEDPAKWVMDTYPW 



35.5% identity in 112 residues overlap; Score: 165.0; Gap frequency: 1.8% 

Cel ' 11 IYTDWANRHLSKGSLSKSIRDISNDFRDYRLVSQLINVIVPINEFSPAFTKRLAKITSNL 

hu2 ' 1 1 IYTDWANaYLTKSGHKRLIKDLQQDVTDGVLLAQIIQVVA-— NEKIEDINGCPKNRSQMI 

******* ** ******** ## 

Ce 1 ' 71 DGLETCLD YLKNLGLDCSKLTKTDI DSGNLGAVLQLLFLLST YKQKLRQLKK 

hu2 ' 69 ENIDACLNFLAAKGINIQGLSAEEIRNGNLKAILGLFFSLSRYKQQQQQPQK 

** * * * * *** ****** *** * * 



24.8% identity in 163 residues overlap; Score: 80.0; Gap frequency: 3.7% 

Cel ' 877 GSQLSLASTT — AYGSLNEKYEHAIRDMARDLECYKNTVDSLTKKQENYGALFDLFEQKL 
hu2 ' 1534 GS S 1»S L VS S TLSVY ST PEEK CQSE IRKIJRRELDASQEKVSALTTQLTANAHLVAAFEQSL 
* ** ** * * * ** w # 

CQl ' 935 ^LTQHIDRSNLKPEEAIRFRQDIAHLRDISNHLASNSAHANEGAGELUIQPSLESVASH 

hu2 ' 1594 GNMTIRLQSLTMTAEQK DSELNELRKTIELLKKQNAAAQAAINGVINTPELNCKGNG 

* * ***** w * 

Cel ' "5 RSSMSSSSKSSKQEKISLSSFGK-NKKSWIRSSLSKFTKKKNK 

hu2, 1651 TAQSADLRIRRQHSSDSVSSINSATSHSSVGSNIESDSKKKKR

* ** * * *** 



identity in 31 residues overlap; Score: 74.0; Gap frequency: 6.5% 






Cel, 653 gypdiIfedssslssgisdnnelddistddls

hu2, 840 GDADSWDDSSSVSSGISDT--IDNLSTDDIN

* * **** ****** * **** 



42.9% identity in 60 residues overlap; Score: 64.0; Gap frequency: 6.7% 

Cel, 984 RQPSLESVASHRSSMSSSSKSSKQEKISLSSFGKNKKSWIRSSLSK-FTKKKNKNYDEAH 

hu2, 1661 RQHSSDSVSSINSATSHSSVGS---NIESDSKKKKRKNWLRSSFKQAFGKKKSPKSASSH

*********** * **** ♦** » w 

22.0% identity in 91 residues overlap; Score: 56.0; Gap frequency: 0.0% 

Cel, 140 SKLPSPRVATSATASATNPNSNFPQMSTSRLQTPQSRISKIDSSKIGIKPKTSGLKPPSS

hu2, 177 SRLSGPTARVSAAGSEAKTRGGSTTANNRRSQSFNNYDKSKPVTSPPPPPSSHEKEPLAS

****** ** „ w „ 

Cel, 200 STTSSNNTNSFRPSSRSSGNNNVGSTISTSA

hu2, 237 SASSHPGMSDNAPASLESGSSSTPTNCSTSS 

* * * * ** *** 



WARNING: 44 local alignments have not been reported because of score < 54.0 







Figure 12b 
fig 13 pCB201 (1 > 5082) Site and Sequence

Enzymes: 100 of 146 enzymes (Filtered)



Settings: Linear, Certain Sites Only, Standard Genetic Code 

'■^CGGATCGGGAGArC TCCCGATCCCCTATGGTCGACTC TCAGTAC AATC TGCTCTGATGCCGCArAG"rAAGCC AG~ATC rGCrcCCTGC TTGTGTGT" 



CTGCCTAGCCCTCTAGAGGGCTAGGGGATACCAGCTGAGAGTCATGTrAGACGAGACTACGGCGTATCAArTCGGTCATAGACGAuGGACGAACACACA. 
T 0 R . E ' S , R S P M V 0 S 0 Y N L L . C R I V K P V S A P c L C "v 

GGAGGTCGCTGAGrAGTGCGCGAGCAAAATTTAAGCrACAACAAGGCAAGGCTTGACCGACAATTGCArGAAGAATCTGCTTAGGGTTAG GCGTTTTGC; 
CCTCCAGCGACTCATCACGCGCTCGTTTTAAATTCGATGTTGTTCCGfTCCGAAC TGGC TG T TA ACG T AC T TC TTAGACGAATCCC A ATCCGC AAA A*" G" 
G G R ■ V V R E ° N. L S Y H K A R L 0 R Q L H £ £ S A q . A F " Q ' 

C7GCTTCGCGAT GTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCA TAGCCCATATA 

GACGAAGCGCTACATGCCCGGTCrATATGCGCAACTGTAACTAATAACrGATCAATAArlATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATA- "° 
A A . S R C T C 0 ' » .» L T L | , Q L L , v , N y G V I S S . P I V 

TGGAGTTCCGCGT TACATAACTTACGGrAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAArGACGTATGTTCCCAT AGT 

ACCTCAAGGCGC AATGrATTGAATGCCATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTrATTACTGCATACAAGGGTATCA 
GVP R y I TYGKVPAVL T AQRPpp I 0VN N0VCSH3 

AACGCCAATAGG GACTTTCCATTGACGTCAATGGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTArCATATGCCAAG TACGCC: 

ttgcggttatccctgaaaggtaactgcagttacccacctgataaatgccatttgacgggIgaaccgtcaIgtagttcacatagtatacggttcatgcggo iM 

NANRQ FPL TSMGG LF TV'NCP L GST SSV S YAK Y A 

CCTArTGACGTCAATGACGGTAAATGGC CCGCCTGGCATrATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGrACATCTACG TATTAGrrA 
GGATAACTGCAGTTACTGCCATTTACCGGCCGGACCGTAATACGGGTCATGrACTGGAArACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCA.- 
" Y • R ° • R ■ H A R L A L C p V H 0 L M G L S r L A V H L R I S H 

TCGCTATTACCATGGTGATGCGoTrTT GGCAGTACATCAATGGGCGrGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATT GACGTCAA 

agcgataatggtaccactacgccaaaaccgtcatgtagttacccgc acctatcgccaaac tgagtgcccctaaaggttcagaggtggggtaactgc ag t" 730 

" ' ' " G ° ■ * V L * ; * 0 V A V , A V , L TGI S K- S P P H , R Q 

T G3GAGTTTGTTTTGGCACCAAAArc AACGGG ACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAArGGGCGGTAGGCGT GTACGGTGu''*A' : 

A^CCTCAAACAAAACCGTGjTT7TAGTTGCCCTGAAAGGTTTTACAGCATrGTTGAGGCGGGGTAACTGCGrTTACCCGCCArCCGCACArGCCACCCI'' ' ' '''' 
'•'EFVL aPKSTGLSKHS . Q L 3 P 10ANGR . AC f V 0 

G-rATATAAGCAGAGCTCTCTGOCTAACTAGA GAACCCACTGCTTACTGGCTTATCGAAArrAATACGACrCACTATAGGGAGACCCAAGC TGGC TAG1' 
CAGArATArTCGTCTCGAGAGACCGATTGATCTCrTGGGTGACGAATGACCGAATAGCTrTAATTArGCrGAGrGATATCCCTCTGGGTTCGACCGAT'*'" * W 

T7 promoter priming site



C * v >: " s S L A N . a T H C L L A r R N 



T7 promoter priming site
"OSL.GOPSVL* 



GTTTAAACrTAAGCTTACCATGGGGGGTTC TCA^CATCATCATCATCATGGTATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGGGATCTGTACGA-- 
CAAATrTGAATT C GAArGGTACCCCCCAAGAGTAGTAGTAGTAGTAGTACCATACCGATCGTACTGACCA:CTGTCGTTTACCCAGCCCTAGACATG.-T: 

ET 



ProBond binding domain

F <LKLTMG 



SHhH HHHG MASMT GGQQMGRDL 



Y D 



GA7GACGATAAGGTACCTAGGATCCATATGCCTCC TTGCCGTCGAGGTGTCAATAACATATC AGTC TCCC TCAAAGGTC 7GAAG3AGAAATGCG 7C3ACJ 



C TaCTGC TATTCCATGGATCCTAGGTATACGGAGGAACGGCAGC TCCACA3TTATTGTATAG TCAGAGGGAGT TTCCAGAC TTCC TC T77ACGC AG?!T^ 



I V 



0. 



pCB201 insert = U4



U4 ORF



QOOKVPRIHMPPCRRG 



N ISVSLKGLKEKCVO 



GCCTGGTGTTCGAGACGCTGATCCCCAAGCCGATGATGC AGCACTACATAAGCCTCC TGC TGAAGCACCGGCGCCTCGTCC TCTCGGGCCCCAGCGSC.*^ 




U4 ORF



3 U . V F E T L . 1 P * , P " " Q H Y 1 S L L LKHRRLVLSGPSGT 



GGGCAAGACCTACC TGACCAATCGCTTGGCCGAGTACCTGGTGGAGCGCTC TGGCCGTGAGGTCACAGAGGGCATCGTCAGCACCTTCAACATGC aCCAu 
CCCGTTCTGGATGGAC ^GGTTAGCGAACCGGCTCATGGACCACCTCGCGAGACCGGCACTCCAGTGTCTCCCGTAGCAGTCGTGGAAGTTG TACGTGGT1! 



pCB201 insert = U4



U4 ORF



Q * T Y 1 T , N R L A E y U . V I « 5 G R E V T E G I V S T F M M H Q 

C A3 TC TTGC A AGGATC TGC AAC TGT ATC T TTCC A ACCTAGCCAACC AG AT AGACCGGGAA AC ACGAATTGGGGATGTGCCCC TGGTGA TTCTATTG3AT3 
'^-*S*ACGTTCCTAGACGTTGACATAGA^ 



U4 ORF



S c , K 0 I Q I YLSNLANO 1DRETGI 



P L V i L L 0 



AC w "3AGTGAAGCAGGCTCCATC AGTGAGTTGGTC AATGGGGCCCTCACC T3CAAGTATCAT AAATGTCCC TA TAT TaTAGGTACCACCAA T CAGC w 
"GltAC ^ A -^^GTCCGAGGTAGTCACTCAACCAGTrACCCCGGGAGTGGACGTTCATAGTATTTACAGGG AT ATAATATCC ATGGTGC- r T AGTCG"»AC A 




A A AAA TGACACCCAACC ATGGCTTGCAC TTGAGC 7 TC AG 3AT G T TG ACC T TCTCCAACAACGTGGAGCCA3CC AATGGC TTCC TGGTTCGTTACCT 3A'liij 



"TTAC TGTGGGTTG GTACCGAACGTGAACTCGAAGTCCTACAAC ^GGAAGAGGTTGTTGCACCTCGGTCGGTTAC CGAAGGACCAAGC AATGGAC t CC 

pCB201 insert = U4



"~ ' U4 nap — 

P NH GL H V 3F R ML T F3N M V E » A N G F L V * Y I. F 








BASE COUNT      173 a    168 c    141 g    128 t
ORIGIN



1 gggccctcta gggtgcctgc tgcaggaagc acagcatagg tccagggagc ctctaattta 

61 aataggagaa gtcagagctt taacagcatt gacaaaaaca agcctccaaa ttatgcaaat 

121 ggaaacgaaa aagattcctc caaaggacct caatcgtctt caggtgtaaa tggtaacgtg 

181 cagcctccca gtactgctgg gcagcctcct gcctctgcca tcccttctcc aagtgccagc 

241 aagccctggc gcacgaagtc catgaatgtc aaacacagtg ccacctccac catgttgact 

301 gtaaagcagt caagtacagc cacctccccc acaccatctt cagacagact gaaggcaacc 

361 tgtctcagaa ggggtcaaaa ctgctccctc aggacagaaa tccatgcttg agaaattcaa 

421 gctagtcaat gcccggactg ctttacgccc cccgcagcct cccagttcag gacctagtga

481 tggtgggaag gatgatgatg ccttttctga atctggtgaa atggaaggtt tHaacagtgg 

541 tctgaatagt ggtggctcaa caaatagcag tcccaaagtg tcacctaagt tggcccctcc 

601 aaaagctgga 






LOCUS       AA495042      418 bp    mRNA            EST       27-JUN-1997
DEFINITION  fa05f06.s1 Zebrafish ICRFzfls Danio rerio cDNA clone 5D16 3'.
ACCESSION   AA495042
NID         g2225470
KEYWORDS    EST.
SOURCE      zebrafish.
  ORGANISM  Danio rerio
            Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
            Vertebrata; Actinopterygii; Neopterygii; Teleostei; Euteleostei;
            Ostariophysi; Cypriniformes; Cyprinoidea; Cyprinidae; Rasborinae;
            Danio.
REFERENCE   1  (bases 1 to 418)
  AUTHORS   Clark,M., Lehrach,H., Johnson,S., Marra,M., Eddy,S., Hillier,L.,
            Allen,M., Bowles,L., Dubuque,T., Geisel,G., Jost,S., Kucaba,T.,
            Lacy,M., Le,N., Lennon,G., Martin,J., Moore,B., Schellenberg,K.,
            Steptoe,M., Tan,F., Theising,B., White,Y., Wylie,T., Waterston,R.
            and Wilson,R.
  TITLE     WashU Zebrafish EST Project
  JOURNAL   Unpublished (1997)
COMMENT     Contact: Steve Johnson
            Washington University School of Medicine
            4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108
            Tel: 314 286 1800
            Fax: 314 286 1810
            Email: est@watson.wustl.edu
            Steve Johnson lab internal ID - P2_60 NOTE - For this library, the
            CLONE id field represents a position identifier on the original
            cDNA library preparation plate. cDNA Library Preparation: Matthew
            Clark. cDNA Library Arrayed by: Matthew Clark. DNA Sequencing by:
            Washington University Genome Sequencing Center Clone distribution:
            Genome Systems, St. Louis, and Max Planck Institut fuer Molekulare
            Genetik, Berlin Tel +49 30 84 13 1235
            Seq primer: -40m13 ET from Amersham
            High quality sequence stop: 416.
FEATURES             Location/Qualifiers
     source          1..418
                     /organism="Danio rerio"
                     /note="Vector: pSPORT1; Site_1: NotI; Site_2: SalI; 1st
                     strand cDNA was primed with a Not I - oligo(dT)15 primer
                     (5' pGACTAGTTCTAGATCGCGAGCGGCCGCCCTTTTTTTTTTTTTTT3'), on
                     mRNA from pooled 26 somite zebrafish embryos;
                     double-stranded cDNA was ligated to Sal I adaptors (BRL),
                     digested with Not I and cloned into the Not I and Sal I
                     sites of the pSPORT1 vector (BRL). Library was constructed
                     by Matthew Clark (Lehrach lab; ICRF, London and Max
                     Planck Institut fuer Molekulare Genetik, Berlin) and was
                     not biochemically normalised. 70,000 clones from this
                     library were arrayed on high density filters and
                     subsequently screened by oligonucleotide hybridization
                     fingerprinting to identify unique or minimally redundant
                     clones for more intensive analysis."






Tuesday, 18 November 1997 10:09



AGGAAGC TGGTAGAGTCAGACAGCGACATC AATGCCAACAAGGAAGAGCTGCTTCGGG TGCTCGAC TGGGTACCCAAGC TGTGG TATCATC TCC AC ACC " 
TCC TTCGACCATCTCAGTC TG TCGC TGTAGTTACGGTTGTTCCTTC TCGACGAAGCCCACGAGC TGACCCATGGGTTCGACACCA TAGTaG AGSTG TG'jA 



pCB201 insert = U4



U4 ORF



RKLVESOSO I N ANKEELLRVLOVVPKUVYHLHT 



TCCTTGAGAAGCACAGCACCTCAGACTTCCTCATCGGCCCTTGCTTCTTTCTGTCGTGTCCCATTGGCATTGAGGACTT CCGGACCTGGTTCAT7GAr 
AGGAACTCTTCGTGTCGTGGAGTCTGAAGGAGTAGCCGGGAACGAAGAAAGACAGCACAGGGTAACCGTAACTCCTGAAGGCCTGGACCAAGTAACTGGm 



•ao: 



pCB201 insert = U4



U4 ORF



F L E K H J T . S D F L I C P C F F L S C P I G 1 E 0 F R T V F I 0 L 

gtggaacaactctatcattccctatctacaggaaggagccaaggatgggataaaggtccatggacagaaagctgcttgggaggacccagtggaatgsgt: 



caccttgttgagatagtaagggatagatgtccttcctcggttcctaccctatttccaggtacctgtctttcgacgaaccc tc c tgggtc 



:accttac:ca3 




V N N S I 



PY LQEGA KOG I KVHGQKAAWeDPVEVV 



CGGGACACaCTTCCCTGGCCATC A5CCCAACAAGACCAA TCAAAGC TGTACCACCTGCCCCCACCCACCGrGGGCCC TCACAGCATTGCCTCACITL'CC:: 
GCCCTGTGTGAAGGGACCGGTAGTCGGGTTGTTCTGGTTAGTTTCGACATGGTGGACGGGGGTGGGTGGCACCCGGGAGTGrCGTAACGGAGTGCAGG^ 




= OT LPVPSAQ QDQ S KLYHLPPPTVG PHSIASPP 

AGGAT AG3ACAGTC AAA3ACAGCACCCC AAGTTC TCTGGAC TCAGATCCTC TGATGGCCATGCTGC TGAAAC T TC AAGAAGC TGCCAAC TA C AT'GAG TL' 
TCCTATCC-GTCAGTTTCTGTCGTGGGGTTCAAGAGACCTGAGTCTAGGAGACTACCGGTACGACGACrTTGA AGTTCTTCGACGGTTGATGrA^C TCcJ 

pCB201 insert = U4




£ ° R T V k °, S T ? S S L 0 S D P L M A fi L L K L 0 E A A H Y I E > 

tccagatcgagaaaccatcctggac:ccaaccttcaggcaacactttaagggttcggcaatcactgtcacccccggacagcagaacgc- G3cat:a3c 

AGGrCTAGCTCTTTGGTAGGACCTGGGGTTGGAAGTCCGTTGrGAAATrcCCAAGCCGTTAG TG AC AGTGGGGGCC TGTCG TC TTGCGACCG TAGT IGA 1 



U4 ORF

p 0P£TlLD5 NL0ATL 



pCB201 insert = U4



GFGMHCrlPR T A E R \V H 0 L 




KTTAGCTCCTCCTCTCCCCTCTCC^ 

agaatcgaggaggagaggggagaggagaaagtctcgtgaccgagaggtcggggtcctcctcttgtcctccc tcctcctctactttctcctccctgtccaa " : ' 



pCB201 insert = U4



s ; L , L L s p L L F , Q f T g s p a p g g e q e g g g o e r g g t g 
ctt^gc^ 

gaaccacgacatggaaactcttgaaggatccttccttaccaccccaccgcaaacccttgaacacgggggatttgtgtaaatgaccggaggagat^ 



pCB201 insert = U4



S V C C T F E , N F L G R N G G V A F G N L C P L NTFTGLL. . L 
TTGGGGAAAAGATGArTCTGGGTCTTTCCCTTGACTTCTTGTTTCAATTACAAACTCCTGGGCTTTCTGGGGAGGGGT 

aaccccttttctactaagacccagaaagggaactgaagaacaaagttaatgtttgaggacccgaaagacccctccccaagtcttttgtag 



pCB201 insert = U4



V G K 0 0 S G S F P , L L V S 1 T N S V A F V G G V Q K T S K H C 

agcagttcctaaatgattctcacaagcaaccctgagagagacagtcttgtgagggagatctgggggaggcaggaagctcctcaga 
tcgtcaaggatttactaagagtgttcgttgggactctctctgtcagaacactccctctagaccccctccgtccttcgaggagtctaaaagagtgtc 2604 



pCB201 insert = U4



SSS .MllT SN PER 0SLVRE1VGR 0E A P 0 I F 3 Q T 
TTCCCAATTCCATCACCAC TGCCAACAC ^G^CCGGAArTCTGCAGArATCCAGCACAGTGGCGGCCGCTCGAGTCTAGAGGGCCCGTTTAAACCCGCT3 

aagggttaaggtag tggtgacggttgtgagcaggccttaagacgtctataggtcgtgtcaccgccggcgagctcagatctcccgg^ r;/ 

_ > 

pCB201 insert = U4

L P N 3 1 T T A N T R P F C R Y P A 0 V R p L E S R G P V . T R 

ATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTT TCC 
TAGTCGGACCTGACACGGAAGATCAACGGTCGGTAGACAACAAACGGGGAGGGGGCACGGAAGGAACTGGGACCTTCCACGGTGAGGGTGACAGGAAAGi 
3 A 3 T V P S S C Q P S V V C P S P V P S L T L E G A T P T V U S 

TAATAAAA.GAGG AAATTGCATCGCATTGTCTGAGTACGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATrGGGAAGACA 
ATTArTTTACTCCTTTAACGTAGCGTAACAGACTCATCC ACAGTAAGATAAGACCCCCCACCCC ACCCCGTCC rGTCGTTCCCCC TCCTAACCC TTCTGT * V 

** . ^ ^ i^shc lsrchs ilgggvgqdskgedve d 

ATAGCAGGCATGC TGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTG7AGCGGCG: 
TATCGrCCGTACGACCCCTACGCCACCCGAGATACCGAAGACrCCGCCTiTCTTGGTCGACCCCGAGATCCCCCATAGGGGTGCGCGGGACAKGCCGCG *** 
W S " " A G 0 A V G 5 " a S E A E R T 5 V G 3 R G Y P H A P C S G A 

ATTAAGCGCGGCG GGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCrAGCGCCCGCTCCTTTCGCTTTC TTCCCTTCCTTTCTCGCC 
TAA7rCGCGCCGCCCACACCACCAATGCGCGTCGCACTGGCGATGTGAACG3TCGCGGGATCGCGGGCGAGGAAAGCGAAAGAAGGGAAG3AAAGAGCGi ^ 
^-SAAGVVVTRSVTATLASALAPAPFAFFPSFLA 

ACGTTCGCCGGC" TTCCCCGTCAAGCTcrAAATCGGGGC ATCCCTTTAGGGTTCCGATTTAGTGCTTTACGGC ACC TCGACCCCAAAAAAC "TGAT TAGU 
TGC AAGCGGCCGAAAGGGGCAGTTCGAGATTTAGCCCCG rAGGGAAATCCC AAGGCT AAATCACGAAATGCCG rGGAGCTGGGGT TT TTT3AAC TAATCC M '* 
7 F A C ? ? R 0 A L N B S ! P l GFRFSALRHLOPK. RLO. 




GTGATCjGrTCACCTAGTGGGCCATCGCCCTGATAGACGGrTTTTCGCCCTTTGACGrrGGAGTCCACGTTCTTrAATAGTGGACTCTTGTTCCAAACTG-i 

' ■-» '» | il 1 I i | i | , , £ y; 

CACTACCAAGTGCATCACCCGGTAGCGGGACTATCTGCCAAAAAGCGGGAAACTGCAACCTCAGGTGCAAGAAArTATCACCTGAGAACAAGGrTTGALL* 
G 0 G $ R S G P S P . T V F R PLT L E S T F F N S G L L F* Q T G 

AACAACACTCAACCCrATCTCGGTCTATTCTTTTGATTTATAAGGGATTTrGGGGATTTCGGCCTATTGGTTAA AAAATGAGCTGATTTAACAAAAATTT 

rTG ttgtgagttgggatagagccagataagaaaac taaatattccctaaaacccctaaagccggataaccaattttttac tcgac i"aaattgtttttaaa " V "" 



T T LNP ISVYSFOL.GILGISAYVLKNEL I 



Q K F 



AACGCGAATTAATTCrGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGrCCCCAGGCTCCCCAGGCAGGCAGAAGTATGCAAAGC ATGCATCTCAATTAGT 

ttgcgcttaattaagacaccttacacacagtcaatcccacacctttcaggggtccgaggggtccgtccgtcttcatacgtttcgtacgtagagttaatca ^ 

'^AM .FCGM CV S.G VE SP Q APOAGR SMOSMHLN 

cagcaaccaggtgtggaaagtccccaggctccccagcaggcagaagtatgcaaagcatgcatctcaattagtcag caaccatagtcccgcccctaactcc 
gtcgttggtccacacctttcaggggtccgaggggtcgtccgtcttcatacgtttcgtacgtagagttaatcagtcgttggtatcagggcggggattgag2 JCw * 

S A T R C G K S P G S P A G R S M Q SMHLN.SATIVPPLTP 
GCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCrCCGCCCCATGGCTGACTAArTTTTTTTATTTATGCAGAGGCCGA GGCCGCCTCTGCCTCT 

cgggtagggcggggattgaggcgggtcaaggcgggtaagaggcggggtaccgactgattaaaaaaaataaatacgtctccggctccggcggagacggagm 3 

P { P , P I T p P S S A H $ P P H G L I F F I YAEAEAASAS 

GA5CTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTCCCGGGAGCT TG ta TATCCATTTTCGGATCTG ATCAAGAGA 

ctcgataaggtcttcatcactcctccgaaaaaacctccggatccgaaaacgtttttcgagggccctcgaacatataggtaaaagcctagactagttctct " c ~'" 
E L F Q * ■ ggffgglgfckklpgac isifgsoqe 



caggatgaggatcgtt tcgcatgattgaac aagatgg at tgc acgcaggt tctccggccgcttgggtggagag gctattcggc tatgac tgggcac aac a 
g-ctactcctagcaaagcgtactaacttgttctacctaacgtgcgtccaagaggccggcgaacccacctctccgataagccgatactgacccgtgttg" 

T 5 • G s F R " I E Q D G L H A GSPAAVVERLFGYOVAGti 

gacaatcggctgctctgatgccgccgtgttccggctgtcagcgcaggggcgcccggttctttttgtcaagaccgacctg tccggtgccctgaatgaact^ 
•:"37tag:cgacgagactacggcggcacaaggccgacagtcgcgtccccgcgggccaagaaaaacagttctggctggacaggccacgggacttacttga: 
" 1 Gc s D a a v f j* l s AO g r p v l f v k t o l s g a l m e l 

CA3GACC AGGCAGCGCGGC tatcgtggc t"ggccacgacgggcgttccttgcgcagctgtgctcg acgttgtcactgaagcgggaagggactggc TGCTA' 
gtcc ^gctccgtcgcgccgatagcaccgaccggtgctgcccgcaaggaacgcgtcgacacgagctgcaacagtgacttcgccc ttccctgaccgacgata ^ f ' 
P - A * R L s v lattgvpcaavlovvteagpovll 



~'jqGCG£~G TGCC5GGGCAGGATCTCC TGTCA TC TCACC TTGCTCC TGCCG AGAAAG TATCC ATC ATGGC TGA TGCAATGCGGCGGCTGC ATACGC TTGA 
ACCCGCT TCACGGCCCCGTCC TAGAG6ACAGTAGAGTGGAACGAGGACGGC TCTTTCA rAGGTAGTACCGACTACGT TACGCCGCCGACGTATGCGAAC " 
L G EVPGQ QLL SSHLAPA EKV S IHADA H R R L H T L D 

~CC GGC T ACC T GCCCATTCGACCACCAAGCGAAAC AfCGCATCGAGCG AGC ACGTAC TCGGATGGAAGCCGGTC TTGTCGA TC A GGATGAfC TGGACGAA 
AG2CCGA TGGACGGG rAAGCTGGTGGTTCGCTTTG TAGCGTAGC TCGC TCG TGCATGAGCC TACCTTCGGCCAGAACAGC TAG TCCTAC TAGACCTGC TT " J% "'* 
P A T C P F 0 H Q A K H R I E R A R T R M E AGLVOQDDLDE 

GA'JtZATC AGGGGCrCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGC ATGCCCCACGGCGAGGATCTCGTCGTGACCCA TGGCGATGCC fGC TT.;;: 
CriGrAGrcCCCGAGCGCGGTCGGCTTGACAAGCGGTCCGAGrTCCGCGCGrACGGGCTGCCGCTCCTAGAGCAGCACTGGGTACCGCTACGGACGAACc 
£HQ CLAPA£L farlkarmpdgeolvvthgdacl 

CGAATATCATGG TGGAAAATGGCCGC TTTTCTG GATTCATCGAC TG TGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGA 
GC TTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGC TGACACCGGCCGACCCAC ACCGCC TGGCGATAGTCCTGTATTGCAACCGATGGGCACT 
P N 1 M V S N G R F SGF IDCGRLGVADRYQD I A L A T R D 

rATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGAT TCGCAGCGCATCGCCTTCTATCGCCTTCTT 
ATAACGACTTCTCGAACCGCCGCTTACCCGACTGGCGAAGGAGCACGAAATGCCATAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAA 
f A E E L GG EV A ORFLVLYG IAAPOSOR IAFYRLL 

GACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGA TTTCGATTCCACCGCCGCCTTCTATGA 
CTGCTCAAGAAGACTCGCCCTGAGACCCCAAGCTTTACTGGCrGGTTCGCTGCGGGTTGGACGGTAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACT "' V '" 
OEFF AGLVGSK PTKRRPTCHHE IS IPPPPSM 

AAGGTTGGGCTTCGGAATCGTTTTCC6GGACGCCGGC TGGATGATCCTCCAGCGCGGGGATC TCATGCTGGAGTT CTTCGCCCACCCCAACTTGTTTATT 
TTCCAACCCGAAGCCTTAGCAAAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGGTTGAACAAATAA 
K G V ASS ^ F S G T P A G . S S S A G I S CVSSSPTPTCLL 

GCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAArAAAGCATTTTTT TCACTGCATTCTAGTTGTGGTTTGTCCAAAQTCATCA 
CGTCGAATATTACCAATGTTTATTTCGTTATCGTAGTGTTTAAAGTGTTTATTTCGTAAAAAAAGTGACGTAAGATCAACACCAAACAGGTTTGAGTAGT 



OLIMVTNKAIASQISQIKHFFHCILVVVCPN 



S 3 



ATGTATCTTATCATGTCTGTATACCGTCGACCTC TAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAAT 

' * 1 1 " 11 11 1 1 1 ■ ■ ' 1 1 ...» - - i - . . i t - ■ ■ \ t i soCi"* 

rACATAGAATAGTACAGACATATGGCAGCTGGAGATCGATCTCGAACCGCATTAGTACCAGTATCGACAAAGGACACACTTTAACAATAGGCGAGTGTTA " * 
M Y 1 M S V Y R R P L A R A V R N H G H S C F L C £ 1 V I R S Q 

TCCACACAACATACGAGCCGGAAGC A TAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCAC ATTAATTGCGTTG 

«-■■■> ' ' i i i . ■ ■ i i i . . | . . ; . | r . u 50 A2 

AGG TGTGTTGTATGCTCGGCCTTCGT AT TTC AC ATTTCGG ACCC C ACGGATT AC TC AC TCGATTGAGTG TAATTAACGCAAC 
FHTTYEPEA. SVKPGVPNE A N S H LRV 






V360 v370 w380 v390 v400 V4I0 v420 v430 v440 

S ^**VTETPAHSVPHTRIJQAKKXrKPSXXXVSK^ 

10 * 20 *M MO -50 -60 -70 -80 -90 

ir450 V4«0 v470 v480 v490 v500 v510 v5 20 V S30 

100 * U0 *«0 -130 -140 -ISO -X60 -170 -180 

V540 VJSO v560 vS70 v580 v590 V600 „610 v620 

-190 -200 -210 -220 -230 -240 - 2S0 -260 "270 

V630 v640 V650 v660 v670 v680 v690 v700 V 710 

"° " 29 ° ' 30 ° *»<> "320 -330 -340 -3S0 -360 

V "° V "° tf74 ° v7S ° V760 V770 V780 v790 V800 

"° 380 • 39 ° -««0 -410 -«20 -430 -440 -450 

V810 v820 V830 v840 v850 v860 v870 vS80 vB90 

460 470 •" 80 •**> -500 -510 -520 -530 "540 

V900 v910 V920 v930 V940 v950 v960 

I^PPTVCPBSIASPPEDRTVXDSTySSLnSDPLMWajJOCEAAKYrESPDRrTILDPKUJATI. 
LPPP: VGP8S . ASPPEDRTVKDSTP : SLOSDPLMAMmXQEAAKKIESPDRZTJQLDPNIjC'ATI. 
U>PPSVGPUSTASPPEDRWF^STPHSLnSDPIJlAKIXJa^EA.^IESPDReTII.DPNl«ATI. 
-550 -560 -570 -580 '590 -600 



SnaB I Nco I

-Ar3GGACTTTCCrAC7TG3CA GTJ.CATCTACGTATTAGT:MTC3CTA7T4 > CCAT357GATGCGGTTTTG 

1 — — — ■ "?o 

GC AG7ACATCAATo3GCGT3 GAT^3CGGTTTGACTCACGG3GATTTCCAAGTC7CCACCC:AT-GACG7C 

— — — — — . , ' - u -yQ 

A«T5GGAGTTTGTTT-GGCACCAAAA-CAACGGGACTT7c:aaAATG7CGTAACAAC7CC3CCCCATTGA 
1 1 1 ' - ' ■ 210 

CG:aaA73GGCGGTAGG C3TG'AC33'GGGAGGTCTATATAAGCAGAGCTCTCTGGC7AACTAGAGAACC 
1 1 ' 1 '■ — — • ' — — 2SO 

L £ N 
Asp 718 



Hind III 



Kpn I Bam HI 



CACTGCTaC TGGC7~ATC3 AAAT7A AT ACGAC TCACTA TAGGGAGACC CAAGCTTGG7ACCGAGCTCGG 

P: -- t 3L5nLIRLT:3R3<L3"ELG 

BstX I 

Spe I Xma III EcoR I Pst I BspM II 

A , r::AC"A5TAAC33CCG::AG73T3C "3GAATT:TGCAGA7 < "ATGCCA7CAA777CCGGATCTCAAGGA 

' 1 — *20 

$~SN33.:CiG!L0lM?5 ISGS03 

^ps:sgsog 

ApaL I 

AC TC ^"G AC A AC A T TG A7G TG A 7 T3 AG 7 TG A AGC A AG AG C TCAA AG A ACGCG A 7 AG TGC AC T77ACGA AG 

1 ~ ~ 1 1 1 — ■ ago 

i" L D N I D V IE-K0ELKE3CSALYE 
T| -DN!C7 lELKJELKE^DSAL YE 

■CC3CC'TGACAATC"GGATC G7GCCCGC3AAGrTGArGTTC^GAGGGAGACAG7GAACAA6T7GAAAAC 

— ' ' ■ £60 

v R - C N L C ^ A P £ 7 D V _ P E T V N K _ K 7 
V ft - 0 N U C n A P £ V 0 V _ R E T V M K L K 7 
CGAGAAC AAGCAA fT aaaGa aaGAAG 7GGAC AAAC TC ACC AACG 3 TCCAGCC AC TCG 7GC TTC T7CCCGC 

- ' 1 1 £20 

t**0-.'* E 7 C L T N G » A T 3 A S S 9 

£ N K 0 L E 7 C »' L T N G - A 7 2 A S 5 R 

GCCTCAATTCCAo T " i "C 7-!GAC3«- ~G A3C A~37C ~A~GA 7GC A3CG73 TAGCA3TACA TC AGC TAG7C 
Asu II

A A TC 77CGAAAC 3ATCC 7C 7GGC 73C AAC7C AATC AAGG T7 AC 7G7AAACGTGGAC A TCGC7GGAGAAA7 

CSSK3.SS3CNS f K V T V N V C I A 6 E I 
C5SK3SS5CN5I<VTVNtfCIAG.c \ 
Pvu I 

Hpa I EcoR V 

i i 

CA3TTCGATCG7TAACCCS3ACAAA3AGATAATCGTAGG A7A7C TTGCC - TG T C A AC CAGTC AG TC A TGC 
i . . . , 1 , 5a0 

SS ! V M P D K £ I IV3YLAHSTSQ5C 

SS I 7 N P Q K E I IV3YLAMST30SC 

"G3AAAG ACA T "3ATGT7TC 7A77C7AGGAC 7A7TTGAAG7C7ACC TA7CCAGAATTGA7GTGGAGCATC 
, u 910 

K D I J V S ! L G L ~ E V Y L S R i C V £ H 
tfK3:3VS!L3LrEVYLSQIuV£h 
Cla I Mlu I 

AAC77GGAATCGATGC7CG7GA77C7A7CCT7GGC7ATCAAAT7GG7GAAC7^CGACGCGTCA7"3GAGA 

. . ' • ■ ■ sso 

C I 3 : D A R j S IL 3YCIGE.P5 VI30 
CL3:DARDSlLGYCIGE.Rn Vl30 
CCC ACAACC A'GATAACC AGCC ATCCAAC7GACA7 7C7TAC 77CC TCAAC 1 ACAATCCGAA TG"T TCA TG 



s • • * : r s h p - o : i tss'tirmfm 

S"~ v :r3H3 r D;L7SS"TIRMF M 
CAC5G T GCC5CACAGAG7C GCG'AGACAG7C7GG7CC77GATA7GCT7CTTCCAAAGCAAATGA7TC7CC 

HGAA35R7 0SL7L D M L ?<C MIL 

HGAAQSRV0SL7L0ML.P<CM!L 
AACTCG7CAAG"CAAT77TGACAGA3AGACG7C7GG7G7TAGC7GGAGCAAC 7GG AA" GGAAAG AGC AA 



0 L 7 K 5 
C L 7 K S 


: u r 
: l r 


ERRLVLAGATG 
ERRLVLAGAT.3 
Asu II


I G K S K 
i G < S K 


AC TGGCGAAGACCCT 


GGCTGCT' 


2fG7AT:TATTC3AAC AAATC AATCC 3AAGA" 


-o'ATTGTTAATATC 


- A K T L 
L A K T L 

Bsm I


^ A 
— A 


'* v* i 1 P T N 0 S £ :• 
'•V5IRTM0SE- 


> 7 N 1 

Bgl II 


•iGC A~7CCT3AAaAC, 


AAT A~A£A 


iGAATT3Cr-;AAG7G3-ACGAC3CCT:3AAA 


i3A"C TTGAG--GCA 


S ! P E N 
s ; • ? E >J 
ELLQVERRLE
* ! L 3 c
Ava III

Nsi I Xba I 

:• 

AA3AATC ATGCA TC37AATTC~AGATAATATCCCAAAGAA TCGAATTGC ATT7G7T3 7A~CCG~7TTTGC 

' ... . 

K E S C IV : - 0 M ^ < N ^ : A - V 7 £ 7 F A 

< - S C i •/ : . C N 5 P < N ; f A f V V S ✓ F A 

EcoR V 

AAATG'CCCACTTCAAAACAACGAAGGrcCAT'ToTAG TATGCACAGTCAACCGATATCAiATCCCTGAG 

% J v p l c n n e g p - v v : t v n r y c IP£ 

M V P „ C M E G P V V Z ~ V N P Y C I P £ 

C~TCAAATTCAC CAC AA7 TTCAAAAfGTCAG 7AATG ""CGAA7CGTCTCG AAGGA7TCA7C CTACG TTACC 



-00 



*5<I0 



ClH-iNF<MSVMSN*L 
0 I H H N F sMSV. v S\ ^L 



L 3 



~C33ACGACGGGC33 TAG AGG A "3 AG TATCG T CTA.sCTG T AC AG ATGCCATC AG AGC7C TCAAAATCAT 

— — ■ 1 ■ ■ ' ■ ■ ■ , .... a i 

LR3RAVE3EYR-.TV)M3Se'.r< ! : 

EcoR I 

^GACT'CTTCCCAATAGCTCT-CAG3CC3TCAATAATTTTA-^GAGAAAAC3AATTC"G"T3A-GTGACA 



•610 



'660 



^ F F ? A l C A V H N F I E K 7 >j 5 v 0 V T 

!> F F ? ALQAVMK'- ! E K r V S 7 D V T 

Bam HI 

GTTGGTCCAAGAGCATGCTTGAACTGTCCTCrAACrGTCGATGGATCCCGTGAATGG ■ -CAFTCGATTGT 
' 1 1 ' 1750 

*G 3 R-CLNC0 L TVCGS3EVF ■ 5 
VGPR^CLNC-LTVC3S3E VF : * L 

GGAATGAGAACT TCATTCCATAT TTGGAACGTG TToCTAGAGATGGCAAAAAAACC TTCG3TCGC TGCAC 
« ■ . , , ; 320 

V N £ N r I ? n L£RVAR}G k '. < ~ S Q C 7 

'« N £ N - : P L £ R 7 A P l * ✓ - r : : q - - 

Bam HI Tth I 

'T::T:C3AGGAT::CACC3ACATC3:CTCTMAAAAA'G3;CG:33T:C3ATGG*3AAi-::CGGAGAA T 
1 ■ — — 1 5 890 

S F E 0 P r C 3 ^ v ^ v F c 3 £ - E X J 

s f e ? p r c : v s *' k * ? -.- f c :• i --. 3 e m 
1960



2030 



Tth I

G'3C"CAAACG-rCTTCAAC7CCAAGACCTCGTCCC3"CACC7GCCAACTCATCCC3ACAACACTT CAATC 

VLXR.OLQCLV. 95PANSS9jQHFN 
'SLrfR.CLQCLVaSPANSSICHFN 

Ava I

Xho I

CC:TC'3A3TCG"T3A TCC^AT-GCATGCr-XCCAAGCATC AGACCATCGACAACATTTGAACAGAAGACTC 

PL£S-. !0L-(A7<h0TI0N!. 
P L t S L SQL4ATKHQTIDN I 

Asp 718 
Kpn I

"AiTC'^CTCTCGCCTCTCCCCCSCTT-CCTTATCT-CGTACCGoTACCTGATGArTCCCCATTTTCCCC 
. ■ • ' > 2!00 

Ava I 
Xma I 

Sma I

I ! 

C7TT7CCCCCCAATT~CCC AGAACC TCC TGT7CCC7"7GTTCCTAGTCC TCCCGGGTGCC3ACGCC3AAG 
« 1 . . 2170 

CGATT-AAAAACCTTTT T CrrTC:GAAACAT7'C:CA77GC*CATTAArAGTCAAATTGAATAAACAGTG 
. 1 . — ' 2240 

Dra II
Dra II
Pss I
Apa I
Pss I

i 

"'AT37ACTTAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGCCC7ATTCTATAGTG7CACCTAAATGCTAGA 

2310 



Bcl I

GC TCGC 73ATCAGCC 7CGAC TG7GCC ~7 


ctagttg:cagccatctgttgtttgcccctcccccgtgcctt 


CCTIGACCC rGGA^GG7GCC ACTCCCAC 


TG 7CC T77CC 7AATAAAA TGAGGAAA 7T3CA7CGCA 7TGTC 7 


GA3TuOG T3TCATTC "ATT: ~GG33GG~ 


333G7GG33CAGGACAGCAAGG5GGAGGATTGGGAAGACAAT 


AGCAGGC ATGC "GGG6A~3CGG~3G 3C 7 


Pvu II 

C TA7GG:""C T 3AGGCGGAAAGAACCA3C"GGGGC7C taggg 


GGrA-.:CCCACGC3CCC7GTAGC3GCGC 


A TT AAGC3CGGC3GGTGTGG7GG7 TAC 3CGCAGCGTGACCGC 



2380 
2«60 
2520 

2590 
2560 
Nae I

•ACAC:T3CCAGC3CCCTA6C GCCC3CrccrTTCGCTTTCT'CCCTTCCTTTCTCGCCAC3TTCGCCGGC 

' ' ' *- ' — 2730 

•"T:CCCGrCAA3CrCTAAA rCGGGGCATCCC T rTAGGGTTCCGATrTAGTGCTTTACGGCACCTCGACC 

' ' ' 1 ' 1 2600 

Dra III 

I 

CCAAAAAACTTGAr7A GGGTGA7GGTTCACG7AGTGGGCCA7CGCCC'GATAGACGGTTTTTCGCCCTTT 

' " * 1 •• . 2670 

GAC3TTG5AGTCC~CG TTCTT7mATAGTGGACTCTTG77CCAAACTGGAACAACACTCAACCC7ATCTCG 

1 ' ' ■ ■ — 2S<J0 

G7CTATTCTTT73Ar7TATA AGGGATTTTGGGGArTTCGGCC7ArTGG7TAAAAAArGAGCTGA7TTAAC 

- ' ! 1 ' 20 10 

AAAAA77TAACGC3AA77 AA77C73TGG AATGTG7G 7CA37"AGGG7G~33AAAG TCCCC AGGC7CCCCA 

1 1 1 ■ ' — 2080 

Ava III
Nsi I 

GG:aGGCAGAAGT^TGCAAA GCAT3:a-CTCAATTAG7CA3CAACCAGG73TGGAAAG7CCCCAGGCTCC 

' 1 ■ — ■ 3150 

Ava III 
Nsi I 

CC A3CAG3CAGA~GrA7 GCAAAGCA TGC ATC 7C AAT "TAGTC AGCAACCA TAG 7CCC3CCCC7AAC TCCGC 

1 1 1 — 3220 

Nco I 

CC ATCCCGCCCC:AAC7C CGCCCAG7TCC3CCCArrC"CC3CCCCATGGC73AC7AA777rT77TATTTA 

■ — ' i — . , , 3290 

Stu I 
Avr II

~GCAGAGGCCGAG3CCGCCTC 'GCCTC7GAGC7ATTCCAGAAGrAGrGAGGAGGCTT T T'TGGAGGCCTA 

1 1 ' ■ i — 336O 

Ava I 
Xma I 

Sma I Bcl I

GGC::ttgCAAAAAGC 'CCCGGGA3C' t GTaTaTCCA7'TTCGGATC7GATCAAGAGACAGGATGAGGA7 

- 4 ' 1 1 3«30 

Xma III 

■:GT7-CG:ArGATT3AAC i^GATGGA7TGCACG:AGG'7:TCCG3CCGCTr3GG:3GAGAGGC-ATTCGG 

■ . , . 3500 

Nar I
Bbe I 

C'ATGACr3GGC~: ^ACA3^CAArC3GC fCC "C TG<i TGCCGCCG 7G T ~C CG3C TG r: AGCGC AGGGGCGC 

' ' ■ ' 2570 
Pst I

CC3G7~C 7777~37CAAGACCGA3C7G7CC3G"3C3C7G AA7GAAC7GC AGGACG A3GCAGCGCGGCTA7 
. ■ i ■ 1 ' 36 UO 

Fsp I 

Bal I Pvu II Tth I 

CGT3GC"3GCCACGACGG3CG""::T~GCoCAGCT3"GCT:GACGTTG"CACTGAA3CGG3AAGGGACTG 
. . . . . 37 ](> 

GCToC"ATT3GG:3AAGTGCCGG33:aGGATC~C:TGTCATCTCACC3CTCC'3:CGA3AAAGTATCC 
= ■ ' ' ' 3760 

ATCATGGCTGATGCAATGCGGCGSCTGCATACGCTTGATCCGGCTACCrGCCCATTCGACCACCAAGCGA 
i . . . . ' ' 3850 

AACA T C'3CAT"C£AGCGAGCACG~AC ~C G 3A TGG A ASCCG 3 TC T 75 TCG A TCAGGA 7 3 ATCT3GACGAAGA 
. . ; , 3920 

BssH II 

GCAKAG33GC-:3CGCCAGCCG-AC-GTTCGCCA3GC-:AAGGCGCGCATGCCC3ACGGC3AGGATC7C 
1 : 2990 

Nco I

G7C37GACC:A"'3GCGA"GCC"G::7GCC3AA' r iTCA7G3TGGAAAA7G3CCGC7:7"C73GA7TCATCG 
■ ■ (.060 

Nae I Rsr II

ACT37GGCC3GCT33G7G73GCG3a:C3CTA7CA33ACATAGCGTT3GCTACCCG'3A7AT7GC7GAAGA 
— 1 ' 1 1 0 130 

GC7TGGC33:GAiT3GGC73ACC::T'CC7:G"3:7-~AC3G7A7C3CC3CTCCC3A7-C3CAGCGCATC 
■ ■ i u?00 

Asu II

GCCT7C-ATCGCC7TC77GACGA37TC7TC7GAGC3GGACTC7GGGG7 r C5AAAT3ACCGACCAAGC3AC 
1 ■ . . , ii270 

GCCCAAC:73CCArCACGAGA"7CGA7TCCACC3CCGCCT"C"ATGAAAGGT7G33C77C3GAATC377 
■ . 1 , . uouo 

Nae I 

77C:GGGAC3CC3G3 7GGATGA':C7CCA3CGC333GA'CTCa7GC7GGA377C7':GCCCACCCCAAC: 

— 1 — — ■ «M 10 

7G TT"A"T3C AGC 77 A ~A AT GG~ ~ AC AAA TA AA3C AA TAGC A7C -C AAA T7TCAC--A7AAAGCATT7 T 7 

— ■ , LiZQO 

Bsm I Sal I

"7cac*g:a:--: ta5""gt3G""3tcc--^c^:^tcaa tcatctt A-:^rG*C3'ATi:cGTC3^cc 

. — • *550 

"TAGCA.3AGC7T3GCGTAA-C ArSG'IAfAGC T3"7-::-GT3T5AAATTG""-:C':GCrCACAiT-C 
— — — ' i.520 



CACACAACAT-C3A3CCG3AAGC-ir-AA3T3'AA^3CC"33GGTGv:::AiT3AG":AGC:AAC7CACA-T 

— — — . . ■ , 1. L690 
Pvu II

AArTGCGTTGCGCTCAC"3:::3:TTTCCAGTC3G3AAAC:TGTC3TGCCA3C"-3CAT-AiTGAATC3GC 



CAACGCGCGGGGAGAGGCG 


3 3C3TATTGGGCGCTC7TCCGCTTCC"C3CTCACTGACTCGC""3C3CT 


CG3TCGTTC3GC T3CGGC3 


AGC337ATCAGC7CACTCAAA3GCGGTAA7 ACGG A TCC AC AGAATC AGG 


GGATAACGCAGGAAAGAAC 


a-G*3A3CAAAAGGCCAGCAAAAGGCCAGGaaCCG"aaaaAo3CCG:3T"G 


C7GGCGTTTTTCCATAGGZ 


:::g:ccccctgacgagcatcacaaaaa-cgacgc":aag"cagaggt3GC 


GAAACCCGACAGGACTAT- 


--GA 7 AC CAGGCG 77 7CCCCC TGG A AGC ~ZZZ TCG "3CGC ~CTCC ~37TCC 


GACCC~GCC3C7TACCGGA 


riC:73rCCoCC7*rTCTCCC77CGGGAAGC3r3GC3:7"-C7:AA73CTCA 


CGCrGTAGGTATCTCAG'T; 


ApaL I

:33~37AGGTCG77C3C7CCAAGCT3GGC'3TG"GCiCGAACCCCCC377C 


AGCCCGACC3CTGC3CC-T- 


^"•::33TAACTATCGTCT"GAG7CCAACCC337AAGACACGAC7-ATC3CC 


AlwN I

ACT3GCAGCAGCCAC "GGT: 


— CA33A77AGCAGAGCGAG3TA"GTAGGC3GTGC ~ -C A3A37 ~C~73AAG 


"GGTGGCCTAACTACGGC"^ 


CAC7AGAAGGACAGTAT77337A-CT3CG:TC*'GC 73 AAG CC AG 7 T AC C 7 


^cggaaaaagagttgg'ag: 


~C~ 73A7CCGGCAAACAAACCACC3CTGG TAGCGG TGG'^TT GTT7G 


CAAGCAGCAGA7TACGCGC- 


GAAAAAAAGGA7CTCAAGAAGA7CC7T*GATC 7: *AC33GG"C TGAC 


GC TCAG7G3AAC3AAAAC TZ 


BspH I

ACG 71 A AGGGA77 TTGG T C A TGAG AT T A~ CA AAA A GoA ~C TTCACCTAGA 


TCCTT-TAAATTAAAAA'G- 


AG~7TTAAATCAATCTAAAG7A**ATATGA3TAa^C TTGG7CTGACAGT-A 


CCAATGCTTAATC^GTGA33 


•:AC::ATCTCAGC3ATC*'G7:-A:TTrG*7:i7C :-T«G"T3CC'3AC"C 


cc:3*:GT3r-G-rAAC~i: 


3-"-C3GGA33GC T TACCATC "GGCCCC -3 FGC ' : aa~GAT iCC 3C3AG 


ACCCACGCTCACCGGCrc:^ 


G ^ " *TA r C AGCAATAAaCC AGCCAoCCGG AAGGG*" Z 3A jC 3CAGAA37GG 


Tc:rGCAACT"-"-T:C':-: :: 


:::ag r: ra r T iAT'G tt«":CC333 a-:,: t - :3 :mAGt^g*":g:c- 
Fsp I

G'UA7AGTT^:3CAAC3":-y3CCArTGC:ACAGGCArCG:3oTG' :ACGC'C3TCGTTrGGrATGG 

CTTCA77CAGC-CC3Gr-;;:AAC3ATCAAS6C3AGT-ACA-G ArCCCCCATGTr5TGCAAAAAAGC3G7 ~ 

— 1 ■ — ■ ' ■ 5*60 

Pvu I 

•"A3CTCC77CG5 7CCTCC3a-:g773TCA3aaG7AA G"" 3GCCGCAGTG77A7CAC7CA7337TA7G3CA 

— — ■ ■ — — - c2:d 

Sca I

GCAC:GCArAArTCrcr-AC-:'CATGCCA rCC3TAAGArGC7-TTCrGTGACTG3TGAGTACTCAAr:CA 

1 1 1 — ' ■ 620C 

AbTCAr,CT3AGAATAGT37^G:3GCGACCGA GrTGC'Cr'GCCC3GCG7CAA7ACGGGATAATACCGC 

' 1 1 6270 

GC:ACA7A3CAgAAC7— "AiAG73C7CA7CA77G3 AAAACG:7C77CG3GGCGAAAAC7:7CAAGGArc 

— « ' ' 1 — — *- ' «■ auyo 

ApaL I

^ACCGC737-AGATCCA3":3A7G7A^CCCA CTCG-3CACC:aaC73A7C7"CAGCA7C7777AC7- 

" ~[ ———————— ■ ££io 

CACCAGC37— C73GG"3-:-:aaAAACA33AAGGCAAAATGCCGCAAAAAAGGGAA7AA3GGCGACACG 

: 1 ^ 6550 

Ssp I BspH I 

GAAA"G7 73AA~^C7CA^»C ~- ~ 7CC 7~ 777C A ATA77A7TGAAGC A 777 A TC AG 33 77A T7G7C 7f A7fi 

6550 

G AAA AG 

— 6720 

Sal I Bgl II Sal I

"GC:aCCT3ACG7:3ACG3^-;:33AGA7C7CCC3A -:c::'A"337CGAC7C"C^G7ACAA7C7GCTC' 

— ' ■ ' — — — ■ - 5790 

AlwN I

GA7GCCGCA7AG77AAGCCAG-A7C7GC7::C"3C7 -G'37GT7SGAGG7CGCTGAG7AG73CGCGAGCA 

1 ' 1 — *- ■ 6660 

AAAT. AAGC '^^aacAmG3-:aa3GC7 t 3ACC-3ACaat-gca-GAA3AA7C7GC'7AGGG77AGGC377' 

Nru I Mlu I Spe I

"GC3C-GC77Ct3:3A7>3-AC:..^;cAGATA YAC3CG-5ACA-3A7-A773AC"A37TATTAA7AS7AA 

— ~ ' — - 7C0O 

i-A AC^..A - Gr '^ T "3CC:ArA-ATGG^G77CCJC3r-ACArAACT7ACuGTAAATS^rr 



AGC3GA7ACA- 777GAA73'4-T7AGAAAAA-±AAC^AArMGGGGT-CC3CGCACAT;--CC> 



!GCC7GGC7 HCZoZCZ • 



. :ijcg:.:;a" T3^c-: "c-a'^atgacgta 7n ~ "7cc - tag7aacgcc 



^^GGGac?— ;A--GACc-AArGG3T3GAC7A---ACGG-AAJic-3CCCACrrr= P rAr,r A r. 



7CAA 

~:io 



Nde I 

GT3TA:CArA-G::AAG"^:GC:::C7ATT3ACGrCAA-GACGG TAAArGGCCCGCC7GGCA7TA7GCCC 

' 1 1 — — ? "80 

-GTACATGACC 
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SstI 

-ATAAGCAGAGCTCTCTGGC-AACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGACTCAC 70 

SstI 

.Ppal Hind III j BamHISpel Xma III Ecofl I 

■ i j j j 

TATAGGGAGACCCAAGCrT3G*ACC3AGCTCGGATCCACTAGTAACGGCCGCCAGT3TGCT5GAATTCTG 140 

R P P V C V x S 

Afl III 

Bgl II Mlu I

CAGA7C7TGGC7ATCAAA7TGG73AAC77CGACGC37CA7TGGAGAC 7CCACAACCA7GA TAACCA3CCA 210 
A 0 L G / Q • 3 £ L R R 7 IGOSTTM TTSH 

M : T S H 

7CCAACT3ACA7TCTTACTTCC^CAACTACAATCC3AATGTTCATGCACGGTGCCG:aCAGAG7CGC3TA 280 
Pr DlLTS5T7[3.vi.- MHGA<:0$fi7 

GACAG7C TGG7CCTTGA7ATGCT7CT7CCAAAGCAAA7GATTC7CCAAC TCGTCAAG7CAAT7T7GACAG 350 
OSLV ta OM.L=>KOM : L 0 L V < S : I. 7 
OSlV.OM.LBKQM I L Q L V K $ I " * 

Bbv II 

j 

AGAGACG7C7GG73T7AGC7GGA3CAACTGGAATT3GAAAGAGCAAAC73GCGAAGACCCT3GC7GCT7A 420 
eR3L7L A3A7GiGKSK.LAKTLAAY 
ER9L7LA3ATGI3KSKLAKT' A A Y 

Asu II Bsm I

: : 

7GTATC7ATTCGAACAAATCAA:c:3AAGATAGTATTGrTAA7A7CAGCAT7CC7GAAAACAA7AAAGAA U90 
v SIPTN4 C cr 0s:vN:s(pENNJK£ 

y S IRTMCS£DS:VN!S1P£NMK£ 

Ava III

Xmn I Bgl II Nsi I Xba I

• i j 

GAAT7GCTTCAA3TGGAACGACGCCTGGAAAAGATC7TGAGAAGCAAAGAArCA7GCA7CGTAA7TCTAG 560 
£L - 0v E R *L-KlLRSK£SC:V!t 

el -Q v er*l£kilrsiccSCivil 

A7AdrA7:CCAAAGAArCGAAT'3rA7r73r7GTArCCGT7'7'3CAAATGrCCCAC77CAAAACAACGA 630 
C N t P N P ; a - V V S V A N V P . 0 V N E 

D N I P • K R ; a r v v s v r A *J V P . 0 N N £ 

EcoR V 

AG37CCATT7GTAG7A7GCACA37CAACC3A7ATCAAATCCC7uAGC7rCAAArT:-.CCACAA7TTCAAA 700 
3 P F V '•/ C * v N P Y C I P E L 0 I H H * F < 
G P F V V : - V *l R Y C / ^ £ *. 0 f H H *l r 
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ATGTCAGTAATGTC3AATC3TCTCGAAGGATTCATCCTAC3TTACCTCCGACGACGGGCGGTA6AGGATG 770 
MSVMSNR .EGF ILRYLRRRAVED 
MSVMSNR^EGF ILRYLRRRAVEO 

Sst I

I 

AGTATCGTCTAACTG 7ACAGA7GCCA7C AGAGCTCTTCAAAATC ATTGACTTCTTCCCAA FAGCTCTTCA 8«0 
EYRLTVC^PSELFKI I 0 F F P ! A u C 
EYRLTVC^PSELFICI I 0 F F P I A L 0 

EcoR I 

! 

GGCCGTCAATAATTTTATT3AGAAAACGAATTCTGTTGATGTGACAGTTGGTCCAAGA6CATGCTTGAAC 910 
A V N N F ItKTNSV OVTVGPRACLN 
A V N N F ic< T NS\/3v'TVGPRACLN 
Bam HI 

7G TCC7CTAACTGTCGAT3GA"CCC3TGAATGGrTCATTCGATTGTGGAATGAGAACT r CATTCCATATT 980 
CP *. 7"VC3 $REy r :J?LVNE^rIPY 
CPL Tv/DGSREVF ! RLVNENF |»Y 

Afl III Bam HI Tth I

' I i 

TGGAACGTGTTGCTAGAGATGGCAAAAAAACCTTC5GTC3CTGCACTTCCTTCGAGGATCCCACCGACAT 1050 

LERVARD3KK TFGPCTSFEOPTO ! 
lE9VARC3K<7fGRC7S. r EDPTDI 

Bbv II

CG TC TC 7 A AA AA A TGGCC 3 TGG 7TCG A7GGTGAAAACCC 33 AG A A TGTGCTCAAACG7C 7 TC A AC TCCAA 1 120 
VSKKVPwFDGENPENVLKRLQLO 
VSKKVPwFOGENPENVLKR LOtO 
Tth I Xho I

gacctcg tcccgtcacc7gccaactcatcccgacaacac ttcaatcccctc3ag7cg 77gatccaattgc 1 190 
olvpspanssrqhfnplesl i q l 
olvpspanssrqhfnplesl iol 

Bbv II 

ATGC7ACCAAGCATCAGACCA7CGACAACAT773AACAGAAGACTCTAATCTTC7CTCGCCTC7CCCCCG 1260 

HATKHOTIDNi 

HATK HOr:DNI 

C7TTCC"TArC7T:3'ACC3G"ACC;GAToAT7:c:CA7TT T CC:CCT7TT:CCCCCAATTTCCCAGAAC *3jC 

Xma I 
Sma I 

CTCCTG77CCC77T3T7CCTAG7C::CCC33G7GCC3AC3CCGAAGCGATTrAAAAACC7TT T 7CTTTCC T40C 
Xmn I 



GAAACAT7 TCCCATTGC7CAT 7AATAG TC AAAT T3AATAAACAGTGT A ToT ACT 7 AAAA AAA AAA AAAAA 1U70 
Sst I Bcl I

AAAAAAAAAAAAGGGCCC7A7-C7A7AGTGTCACC7AAATGC7AGAGC7CGC7GATCAGCCTCGACTG7G JS^C 
CCTTCTAGTTGCCAGCCATC-GTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTC 

CCACrGrCCT*"CCrAAriiAArGAGGAAArTGCArCGCAT7GTCT3AGTAGGTGTCATrCTATTCTGGG 166C 

Bbv II

GGGTGGGGTGG'33CAGGACAGCAAGGGGGAGGArTGGGAAGACAATAGCAGGCATGCTGGGGATGC3G7G 175C 

GGCTC7ATG3C"7CTGAG3:GGAAAGAAC:aGCTGGGGCTC7AGGGGGTATCCCCACGCGCCCTGTAGC6 162C 

GCGCA7TAA3CGC3GCGG3 7G7GG7GGTTACGCGCAGCG7GACCGC7ACACT7GCCAGCGCCC7AGCGCC 169C 

CGCTCC7TTCGC777C— ::C77CCT77CTCGCCACG77CGCCGGCT77CCCCG7CAAGC7CTAAATCGG 196C 

GGCA7CCC77~AGG37"C:dA7T7AG7GCT77ACGGCACCTCGACCCCAAAAAAC7TGA7TAGGGTGATG 203C 
Dra III 

G77CACG7A3~333:CA*C3CCC7GArAGACGG77T7-C3CCC777GAC3T7GGAGTCCACGT7C7T7AA 2XC 

'AG7GGAC7:"3T7CCAi^C"G3AACAACAC7CAACCC7A7C"C337CTAT7C77T7GA77TATAAGGG 2!7C 

Xmn I 

AT77"GGGGA7-7C3GCC"A"-GG77AAAAAArGAGC7GA777AACAAAAAr77AACGCGAA77AATTC7 22*0 

G 7GG AA7G7 3 "3 TC AG 7"" -3GG 73 7GGAAA37CCCCAGGC7CCCC AG GCAGGC AG AAG7ATGCAAAGCA7 231C 

Ava III 
.Nsi I 

GCATC ~C AAT " A3TC AGC -ACCA3GTG T 33AAAGTCCCC AGGCCCCCA3CAGGC AGAAG TA7GC AAAGC 226C 
Ava III 
Nsi I 

AT3CA7C7:aa*"TAG7CA3CAACCATAGTCCCGCCCC7AAC7CCGCCCATCCCGCCCC7AAC7CCGCCCA 2<i*C 
GT7CCGCC:i— C7CCGC::CAr3GC7GACTAATTT77T7TA7T7A7GCAGAGGCC3AGGCCGCCTC7GC 252C 

Xma I 

Stu I | Sma I 

i ! 1 

C7CTGAGC7^"*C:aGAAGTAG7GAGGAG3C77TT77GGAGGCCTAGGCTT77GCAAAAAGC7CCCGGGA 2590 

Bcl I

GCTTG7ATATC:.iTT7'C33ArC73A7CAA5AGACAGGATGAGGATCG"TTCGCATGA77GAACAAGA7G 2660 
Xma III 

GATTGC AC 3C AG3 T 7C "C C3GCC3C 77G 33TGG AGAGGC TA77CGGC "A TGAC TGGGC AC AACAG AC AAT 27j0 

Nar I 
Bbe I 

cg3C"gct;::at3::gc:3"G"tccgg:73tcagcgca53ggc3:ccg3 7tc""7T7G7:aagacc3ac Zc?c 

CT3TCC-3GT3:;:r3AA"3^AC"3CAGGiCGAGGCAGCGC3GCTATCG"33C'GGCCACGACGGGCGT7C 2S7C 
Fsp I Tth I 

CTT3CGCAGC"3T3C 7C 3-CGT7GTC AC T3-AGCG3GAAGGGAC T33C *3C TATTGGGCGAAGTGCCGGG 22*0 

gc-gga:ct:.:'3t:a"c tcaccttgctcctgccgagaaagtatcca-catggctgatgcaatgcggcgg jo:c 
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"3ATCCG3C7ACCTGCCCATTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTA 3080 

AGCCGGTCTTG7CGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGA 3J50 
BssH II 

A CTG77CGCCAG3:TCAAGGCGC3CA7GCCCGACGGCGA33A7C7CG7CG7GACCCA7GGCGA7GCCTGC 3220 



CTGCATACGC 
CTCGGA7GGAAG 

BssH II 



Rsr II 



I;°^- A *I 4 " CAT3CTG3 - A ^ TGGCC5CT77TCT6GATT CATCGACrGTGGCC3GCTGGG7GTG5CGG 3?S0 
AC . C ' C ; A ; C ; G ! ACATAGCG ^ 3360 
<- rTCC . i.GTGl . Tj CGGTa .CGCCGCTCCCGAT7CGCAGCGCATCGCCTTC7ArCGCC7TC776ACGAG 3«30 

Asu II 

lllllrrlrr-^^^ 35CO 
rr": rr rr^-^ 3570 
^Cu^GCoCoG^ATCTCAToc.SSAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATAArGGTT 36U0 

Bsm I 

ACAAA7AAAGCAATAGCATCACAAArTTCACAAArAAAGCATTTTTTTCAC7 - GCATrC7AGTT'G7G6rT7 3710 

Sna I 

^ ccaaac i ca : caa ; g :^ ^eo 

^tllr-rtt .-'~lrr£ ' ^^~A7T377A7CC 3C7CACAAr7CCACACAACATdCGAGCCGGAAGCA 3650 
.AAAG.G7A 6 ^.„TGGGG7GCCTAATGAG7GA5CTAAC7CACAT7AA7TGCG7T3CGCrCAC-GCCCGC 3920 

Sbo I 

llllit c lif cc :t:ti c ^ 3sso 

CG7A,.G«oCCC i .7.CCGC77C:rCGC7CACTGACTCGC7GCGC7CGG7CG77C5GC7GCGGCGAGCGG *C60 

AH III 

rccrr-rlrr CA !;; AAAGGCCAGGAACCGTAA " AGGGGG CGTTGC7GGCG777T7CCATAGGCTCCGC U2C0 

Irrlrrrrrr^^^ "270 
r ^ AGG " r T T r L .;;;5: GGAAGCTCCCrCGrGC5:rCTG ^ G "-GACCC7GCCGCT7ACCGGATACCr «3«0 

G.CCG.CTTTC C u 3GAAGCG7GGCGC77TCTCAATGCrCACGCTGTAGG7ATC7CAG7TC3GTG «U!0 

ApaLI 

-AGG:CGTTCGCT;CAAGC7GGGCTG7G7GCAC3AACCCCCCGrTCAGCCCGACC3C7GCGCC-7ATCCG "«80 

Alwn I 

:" AA ^; 3 ^7 3AG7CCAAC::GG " A ^ 

moAAG.^a. A . . 7GG : A TC . GC3C TC TGCTG A AGCCAGT ~ACC TTCGGAAAAAGAGT7GG 7AGCTC 77 *cS0 
llllc> : rrr-'T- ACCA ;;^; :5GTAGC3GTGGTTTT " TTG "TSCAAGCAGCAGAT-ACGCGCAGAAA C76C 
AAAAGv.-M7Cr,.,oAM U ArCc • r3ATCTTr7C7ACGGG3rc-GACGCrCAG7GGAACGAAAAC::ACGr y 8 3C 
BspH I 

-AAGG:ATTT-G3T:A7GAGA7-ATCAAAAAGGA7C77CACC-AGATCCT7T7AAAT'AAAAA-GAAGr7 ugoo 
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r7AAA7CAATC7AAAGTArATA-3AGTAAAC7TGGTCrGACAGrTACCAArGCrTAArCAGTGAGGCACC 4970 
TATCTCAGCGA7CT3TC7ATT-C3TTCAT:CATAGT7GCCTGACTCCCCGTCGTGTAGATAACTACGATA 50U0 

Ppa I 

C6GG AGGGC 77ACCA 7C TGSCCCC AG 7G C 73C AATG A7ACCGCG AGACC CAC GC T"C ACCGGC 7CCAGA 7T 5110 
-ATCAGCAATAAACCAGCCAGCC33AAG33CCGA3CGCAGAAGT6GTCCTGCAACTTTATCCGCCTCCAT 5180 

Fsp I

CCA3 7C7ATTAA TTG77GCCGGGAAGCA3AGTAAG7AG 77CGCCAG77AATAG777GCGCAACG77G ~7 5250 
GCCA7TGC7ACA3GCA7C37GG737CAC3:7CG7C3777337A7G5C77CATTCAGC7CCGG77CC:aaC 5320 

Pvu I 

GATCAAGGCGAG7TACA7GA7CC:CCA737TGTGCAAAAAAGCGG77AGC7CCTTCGG7CC7CCGA7CG7 5390 
7GTCAGAAG7AAG77GGCCGCAG7G77ATCAC7CArGG77A7GGCAGCACTGCATAA77CTC77AC757C 5U60 

Sca I Sbo I

A7GCCA7CCG7AAGA7GCT7Trc737GAC73G7GAG7AC7CAACCAAG7CA7TC73AGAA7AG7G7A7GC 5530 
GGC3ACC3AG773:7C7 T GCCCc3:G7C^irACGGGA7AA7ACC3CGCCACA7AGCAGAAC777AAAAG7 5600 
Xmn I 

GCT:ArCATTGGAAAACGT7C";35GG:3AAAACTCrCAAGGA7C7TACC5C7GTTGAGA7CCAGTTr G 5670 
EcoK ApaL I 

A7GTAACCCACTC3TGCACCCAAC7GA7Cr7CAGCA7C7777AC777CACCAGCG777C7G3G7GAGCAA 57Q0 
AAAC AGG AAGGC AAAA7GCCGCA-AAAAG33AA TAAGGGC3ACACGG AAA7G77GAA7AC 7C A7AC 7C77 5810 
Ssp I BspH I

CCT7T7-CAA7AT7A77GAAGCA777A7CiGGG7TAT7G7CTCA7GAGCGGATACATA77TGAA7G7A77 5380 
»5AAaaaTAAACAAA7AGGGG77CCG:3:aCATT7CCC:3AAAAG7GCCACC7GACG7C3ACGGATCGG 5950 
* Alwn I 

GAGA7C7CCCGATCCCC7A7GG-;3AC7C;CAG7ACAA7C7GC7:7GA7GCCGCATAG77AAGCCAG7a7 6020 
^ : ^«CC:TGC.73TG7GTTGGA3oTCGC73AGTAG7GCGCGAGCAAAATTTAAGCTACAACAAGGCAAG 6090 

Nru I

GC77GACC3ACAA77GCA7GAAGAA7C73;77AGGG77AGGCG7T77GCGC7GCrTCGCGA7G7ACGGGC 6160 
Afl III 

Mlu I Spe I 

CA3A7A7AC3CGT73ACATTGA"TA77G-:7AG77A-AA7.AG--ArCAA77ACGGGG7CAT7AG77CA7 6230 
A Gu ; .CA.ATA:GGAG7:CCGCG-7ACA7AAC77ACGG7AAA7GGC::GCCTGGC7GACCGC::AACGACC 6300 
C^.oCCCA7.GAC37CAATAA7GACG7AT3 7-CCCA7AGT^ACGCCAArAGGGAC777CCAT7GAC37CA 6370 

Nde I 

AT33G*G3AC7AT77ACG3TAAA:T3CC:i:77G3CAG7ACA7CAAG:G7ArCA7ATGCCAAG7AC3:CC qMO 
CcTA. ^C5-CAirGAC35-AAAT3GC::3CCTG3CA:7A-GC::AGrACA7GACC7-ar3GGACT7-C 5510 

SnaB I 

C7ACT7GGCAG7ACA7C7ACG7AT7AG7CATCGC7A77ACCATGG7GA7GCGG7TTTGGCAG7ACATCAA 6580 
rGGGCG75GA7AGC3G77TGAC7CACGGG3A7TTCCAAG7C7CCACCCCAT7GACGTCAATGGGAG777G 6650 
777TGGC ACC AAAATCAACGGGAC 1 77CCAAAA TG7CG7AAC AAC 7CCGCCCCA7 7G ACGCAAA7GGGCG 6720 
G7AGGCG7GTACGGTGGGAGGTCTA 67^5 
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CA G TT AA T AG -r S CGCA A C ( :-3rTGCCATT G crACAGGCATCGrGGTGTCACGCTCGTC G rrT„ TAT 




Sca I

CAGCACTjCATAArTCTCTTAC'GTCATGCCATCCGTAA GATGCTTTrCTGTGAC TGGTGAGTArTCAar 
CAAGrCArrcrGAGAATAGTG-ATGCGGCGACCGAG TTGCTCTTGCCCGGCGTCAATACGGGATAArACC 

GCGccACArAGCAGAAcrTtA^^rccrc^c^^^^^^^^^^^^^^^^^ 



:«5o 



-20 




OS 

CGGAAATGTTGAATACTCAT ACyCTTCCrTTTTCAATA TrATTGAAGCATTTATCAGGGTTATTfiTrTrfl 
rGAGCGGATACATATrrGAArGrArTrAGAAAAATAAA CAAArAGGGGrTCCGCGCACATTTCCCr^^ 

Sal I Bgl II Sal I

g orGccAccrGAC3TCGACGGA-CGGGAGArcrcccGArccccrArGGrcGAcrrTr„ T . rA . r , T ., T 

Alwn I 

5J^ A ~^ ^^ A ^~**^^ A GCCAG TATC ^SC^CCC TGCTTGTGTG TTGGAGGTCGC Jr^n tag -rrnrr. \r 



7CO 
770 




Nru I Mlu I Spe I

T^CrGC — SCGArsrACG33CCAG A rAr & C S CGTT6ACArTGATTAT7GACTA 6 rT A TT..r,.- 



910 
$80 



'050 



i 3 20 



WO 98/24810 
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CCCGCCTS6C7GACCGCCCA^CGACCCCCSCCCArT6ACGTCAATAATGAC6TATGTTCCCATAGTAAC6 
CCAATAGGGAC~TTCCATT3^CGTCAATGGo~GGACTATTTACGGrAAAC TGCCCACT"GGCAG~AC 8 T C ' ^ 

p*l " ' ' ' ~ — 1260 

AAGTG-ATCATATGCCAAS7^C6CCCCCTATTGACGTCAArGA C 33 TAAA TGGCC C3CCT3GC A - TATGC 

' ' — 1 1 — - — ■ ■ • 220 

SnaB I Nco I 

CCASTACArGACCTTA7Gu5.AC-TTCC"ACrTGGCAGTACATCTAC G 7A TTAG7C ATCGC TA7~ACCA TG 

*- 1 " — . — • a 00 

GTGA-GCGGTTTTGGCAGrACA-CAArGGoCGTGGATAGCGGTTT GACTCACGGGGAr-CCAAGTCTCC 

1 ' 1 1 • \x 7Q 

ACCCCATTGACGTCAATGGGAG-77GT7TTGGCACCAAAArCAAC GGGACTTrcCAAAA7GTCG-AACAA 
_ 1 ■ : c 

C7C:GCCCCA77GACGCAA.i-GGCGG-AGGCG7G7ACGG7GGGA GG 7C TA 7 A T A AG C AG AG C 'Z TC TGG 

' l — ' — - ? 6 10 

: Hind III 

C7AAC7AGAGAACCCAC7GC-AC7GGC77A7CGAAA77AA7ACGA CCAC7ATAGGGAGACCCAAGCT7 

. ' ' ' — :680 

Asp 718 Bam HI BstX I

Kpn I Spe I Xma III EcoR I Pst I

GG7ACCGAGCrC5GA7CCAC^G7AACG3C:GC:AG7G7GC7GGAAr- CTGCAGATCGCCGC-GC-Arr" 
r - r I I ' ~ ~~ ' — 1 1750 

G £L5 -" Sw 3r ocag: _oiaaa- 

Ava I 
Xma I 

Sma I Pvu II

CAACCr7CGGA CA A CAT7 C GC'AAGArcCCCGGGATAC7CA7CCTAT7CTCCACAC77ATCAG-GTCAGC 

S ' " G 0 H S - » S P G Y S S Y s p H L s iy s A 

Spe I 

Sal I 

7GATAAGGACAC AATGTC Ta "*GC AC 7CACAGAC TAG 7CGACGACC TTCT TCACAAAAACCAAGC "ATTQA 

0 " C T M 5 ^ 3 0 T S * R P S 3 0 < P * < S 
v< - K ' u S 0 " S R p ? c s - ? , = 
£GCC AA "T TC A7 T C AC ~~3~ ~CG 7~A A 'GCCACC ^7C AAqAG T~C AC A^ CC ACCG AGCAC AGAA "33CGG 
- o - ^ . , " I ' 

• ' H * L — K : h . c £ - - a r £ H R 4 



!820 



3 ' :Fh slC'k:f_oe 



S " £ h P v A 
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: Ava 1 Bam HI 

: i 

CTCTCT-3AGCCC5AGACGGGTGCC3AACTCGAT3TCGAAA TATGATTCTTCAGGATCCTACTCGGC3CG 

' i ■ i , ■ i . 2030 

AL - S=> *3VPNSM$KYDSSGSVSAB 

AL - S3R *\'PNS*$KYCS$GSYSAR 
Ava I 

"rCCCGAGGTGGAAGC"CTAC"G3rATCTATGGA3AGAC3TTCCAACTGCACAGACTArcC3A7GAAAAA 

' — ' — ^ ' 1 2-00 

SRG3 5S'GIYG£TFQLHPLS0E< 
SRG5 SS"GI YSETFQ L HRLSOE< 

Bam HI Nde i 

I i 

"CCCCCG:ACATTCTGCC^AAAGr3AGATG GGATCCCAACTArCACTGGCTAGCACGACAGCA7AT33AT 

' lii I . . ,| . . | p • 

S?AH S AKS£>-6S0LSLAST-AYG 

3?ah s-ksemgsclslast:ayg 

Sal I 

C7CTCAArGAGAAGTACGAACAT3CTATT CGGGACA7GGCACGTGACTTGGAGTGTTACAAGAACACTGT 

" ' ' 1 22 UO 

SLNE<V £HAIR0.VAPDLECYICNTV 
5 L %J E < v £ H A I R 0 m A P C I E C Y K N 7 v 

Hind III 

CCAC"CACTAACCAAG^AACAGGA3AAC TA7GGAGCA77GTTTG ATC^7 TTTGAGCAAAAGC ~7AGAAAA 

— 1 1 ' ' 22io 

D 5 L 7 K < * E N Y 3 A L F 0 L F £ 3 K L R < 
0 5 L T K < G E N Y G A L F 0 ' F^3< L R< 
Cla I 

C7CAC^CAACACATTGA-CGA7CCAAC7TGAAGC CTGAAGAGGCAATACGATTCAGGCAGGACA7T3C7C 
— ' 1 ' 2380 

L70H ^CRSNUKPEEAIRFRQ0lA 

-"° H, '- R SNLKPEEAi^FPOOlA 



H L ^ C 


I s 


■'^LASNSAHANE 


G 


- G £ . L 


H L ^ o 


I s 


: < * I A S M S A H A X £ 


r. 


A G E , L 






Cla I Cla I






~C37C -AC C A • 


'C7:tg-: 


A^TCAGTTGCATCCC ^TC3A7CATCGATGTCATC3 


"rr. 


tcgaaaagcagcaag 


5 Z p 


3 L 


*- ? v A 5 M R S S M S i 


'3 


3 k $ l < 


^ p 


S L 


r V A 5 H P $ S M S c 


c 





?«50 
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Bam HI 

CAGGAGAAGATCAGCTTGAG CTC3TT7G5CAAGAACAAGAAGAGCTGGATCCGCTCCTCACTCTCCAAGT 

1 ' 1 , 2590 

3E<:SL35FGKNKlCSVfRS5 L si C 

0 E < I SL S5F5KNKXSVIRSSLSK 

Nde I BspM II

"CACCAAGAAGAAqAACAA q AAC TACGAC3AAGCAC ATA T3CCATCAATTTCCGG ATC *C AAGGAACTC T 

: — — ' * 1 1 — 2660 

. FT <K:<NKM V 0£AHMP3 I SGSGGTL 
F T <K<NIC*iYDEAHMPSlSGSCGTL 

ApaLI 

*G AC AAC A TTG A FGTG AT T 3 AG " To A AGC AA G AGC TC AA AG AACGC3 AT AG TGC ACT AC GAAG7CCGC 
— « 2730 

3 N I 0 V CELKQELKERDSAL v £ V 3 
3 N I 3 V I E L KCSLSCERDSAlYEVR 
C TTG AC AA TC 'GGATCG TGCCC 3 CG AAG T TG A 7G TTCTGAqGGAGAC AG TSAACAAGT'GAAAACCGAG A 

' 2800 

'-D.NL3RA3EVCVLRE"7NKLKrE 
-0NLDRA5EVCVLRETVN<LKTE 

acaagcaattaaagaaag aagtggacaaactcaccaacggtccagccactcgtgcttct^cccgcgcctc 

~~ 1 1 1 — — ' 2870 

, ^K0L<KE7O.<LTN£?A7RaSS3aS 
* < 3 L < K E V 0 < l T N G => A T R a $ = =1 A S 
AA TTCCAGTTA^CTA CGACG A"*GAGCATGTC 7 A fGATGC A3CGTGTAGC AG TACATC AGC TAGTC AATCT 

' ' ■ 29^0 

1 P V I Y 0^^HvrCAACSSTSASCS 
I P V I y 0CcHV Y0AACSSTSASCS 

Asu II 

TCGAAAC3ATCCTCTGGCT 3CAACTCAA TC A AG3 T T AC TG T AA ACGTGG AC A TCGCTGG AG AAATC AG TT 

— ' ' ' ■ = 30 10 

S K 3 .5 S G C *l S I K V r 7 N V 3 I A G E IS 
S K 3 5 S G C N 5 I X V T -/ N y } » AGE I 5 
Pvu I 

Hpa I EcoR V 

CGATCGTT AACCC33ACAAAGAGATAATC3TAG3ATATCTTGCCATGTCAACCAGTCAG'CA"GCT33AA 

- ■ ! ; ' 1 — ■ 3080 

3 : / U - 0 K C : I V 3 Y L A M $ T $ J f C 

3 : v ' : * c k •: : i v 3 y i a m $ t 3 ) 5 : * < 

AGACATT3ATG:TTC7A7T:-iG3AC7ATTTC-AAGrCTAC::ATC:AGAArTGA73TGGAGCATCAAC-' 



# 
Cla I Mlu I

I i 

SeAATCGATSC7CGTGATTe-AreCTrG6CTATCAAATT68T6AACTTCG ACGCGTCATTG6A6ACTCCA 

11 1 ""220 

3 '' 0as ° s:lgvoigeurr V | GOS 
a * oard s ;lg vq i gelrrv ;gos 

CAACCA7GATAACCAGCCArcCAACTGACArrCTTACTTCCTCAAC TACAATCCGAATG7rCATGCACG6 

' ' ' ' ' ' "290 

1 - " j I f h * T 3 1 - T S S - T I R n F „ H 6 
' " a *""~Oll.TSSTT IRMFMHG 

.GV.UGCACAGAGTC3CGTAGACA3TC7GGTCCTTGATATGCTTCTTCCAAAGCAAATGATTCT CCAACTC 

AAOSR\/C3_VLDMLLPK0M I L 0 3360 
* * 0 . S " V S 3 V L 0 M L P K 0 M 2 L 0 I 

G^AG.v.AA. TTGACAGACAGACG-CTGG7GTTAGC7GGAGCAACTGGAATTG5AAA GAGCAAACT6G 

i < I ^ - L r - * I L V " * G A 7 G : 3 K 5 * L 
VKS -- T t?RLVLAGA7GI3K S .<L 

Asu II Bsm I

CGAAGACCeTGGCTGCTTATG-ATCTATTCGAACAAATCAATCC GAAGATAGTATTGT7AATATCAGCA7 

, „ _ ~~~ ' ' 3500 

'SIRTNOSEDSrvNIS' 

A K T «• - A • • s i r r n o s e o s : v n i s j 

Bgl II 

rCCrGAAAACAATAAAGAAGAATT3C--CAAGTGGAACGACGCCTGGAAAA3A7CTTGAGAAGCAAAG AA 

? E N N K £ E w - 0 V E R R L E K I L R S K * ^ 
P Ava III ^ E E L ^ ° V E « " ^ < ■ ^ « S ^ E 
Nsi I Xba I

TCATGCATCG7AATTCTAGATAATArcCCAAAGAATCGAAT7GC ArTTGTTGTArCCG7TTTTGCAAATG 

11 J V ' L C >J •' P K « ' * F V V s V F A N 

'^E*<2?KNR IAFvvSVFAN 

EcoR V 

TCCrACTTCAAAACAACGA^GG-CCA-rTGTAGrATGCACAGTCA/lCCC-ATATCAAATCCCTGAGCTTCA 

V P *- 0 N N E : " p z X V C 7 N , Y r , p E L p 

V " - C N E 3 p V V C 7 v U R Y | p p i ; 
AA T7i ACC AC AA T T TCAAAA *G ~C-5 ~AA T3 7CG&A TCG TC TCG AAGGA TTC ATC C TACGTT ACC TCCG A 

1 h h F ''• M 3 V M S N R I. E G F I • R Y L S 

1 H H H F < i V M 5 N R L E G F I I R Y L R 
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CG AC GGGCGG 7AGAGG AT3AG 7A TCG 7C TAAC TG TAC AG A TGCC A TC AG AGCTCTTC AAA ATC AT TG ACT 
1 ' ■ 2850 

^RAVEDEYRlTVCMPS£L"K:| 10 

^RAV£D£YRLTVCMPSEL r KI ID 

EcoR I 

i 

~C TTCCCAATAGCTCTTCA3GCC3TCAATAATTTTA77G AGAAAACGAATTCTGTTGATGTGACAGTTGG 

' 1 1 ' ' 1 ' 1 ' 3920 

F F ? I A L C A V N N F I E K T M S V D V T V G 
F F ? \ A L £ A y N N F I E K T N S 7 3 V T •/ 5 

: Bam HI 

"CCAAGAGCATGCTTGAACTGTCCTCTAACTGTC3ATGGATCCCGTGAATGGTTCATTCGATTGTGGAAT 
■ « . i 1 1 . ■ 2990 

*RACLN-:?L T VD3SREV- I Q l V N 

3 a a : l n c ? u r v c 3 s r e v .- i a u v n 

GAGAACTTCATTCCATArTTGGAACGTGTTGCTAGAGAyGGCAAAAAAACCTTCGoTCGCrGCACTTCC* 
— 1 ' uC6C- 

E N F I 3 *LERVA3DGKKTF3RCTS 

ENF : ? V LERVARDGKKTF3RCTS 

Bam HI Tth I

7 C3AGGATCCCACC3ACATCGTCTCTAAAAAA7GGCCGT3GTTC3ATGGTGAAAACCCGGAGAATGTGCr 
^ — ■ ■ ' «1X 

F E 0 P T D V 3 < K * - V F G 5 E "*J - E N 7 L 

F E 3 P T C 7 5 < K V ^ V ? D G E M 5 E N V L 

Ava I 

Tth I Xho I 

caaacg"cttcaactccajigacctc3^cccgtcacc:gccaactcatcccgacaacac""caatccc:tc 

' U20C 

<RL3L0CLVPSPA.N3SR0HFNPL 
<RLJL3DLVPSPA.N5SRQHF.NPL 

GAGTCGTTGA TCCAA FTGC ATGC TACCAAGCATCAGACC ATCGACAACATTTGAAC AGAAGAC7C TAATC 

« - i . . , U27: 

esl;olhat:< H0TION1. 

ESL:OLHAT<HO r ICNI. 

Asp 718 
Kpn I 

"TC TC"C3C: "CT::CCC3;'~~::TTATCTTCGTACCGGTACCT3A7GArTrcc:A:TTTCCCCCTT"- 
' — s . . , 

Ava I 
Xma I 
Sma I 

OZ CCCCAAfT'CCCAG AA ~CC T3r7CCCT77STTCCTAG T CC TCCCGoGTGCC3ACGC:3AAGC3-"7 

— — ' ' ■ . — — - • xum: 
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7A AA AACC TTTTTCTT7C Co AAAC A TTTCCC AT TGC TCATTAATAG TCAAA T TG A A T AAACAG TG TATG T 

' J ' ' ' 1 ' — 1 — - «^60 

Dra II
Dra II
Pss I
Apa I
Pss I

ACrrAAAAAAAAAAAAAAAAAAAAAAAAAAAGG oCCCrArTCTArAGTGTCACCTAAATGCTAGAGCTCG 

~— ' ' ' " ' ' ' -650 

Bcl I

I 

CTGATCAGCCTCGAC7GTqCC"^:TAGTTGCCAGCCA TCT3TTGTTTGCCCCTCCCCCGTGCCTTCCTTG 

' 1 1 1 ■ «620 

AC CC TGGAAGG TGCC AC 73 CC AC T3TCC TTTCC TAA TAA AA TG AGGA AA T TGC A TCGC A TTGTCTG AG 7A 

" ~ J ' ' - 1 ' 1 1 1 UQQO 

GGTGTCATTC'ATTCTGG3GGG-33GGT3GoGCAGGA CAGCAAGGGGGAGGATTGGGAAGACAATAGCAG 

— ' 1 ' ■ — „ a760 

Pvu II 

GCATGCTGGGGATGCGGTGGGC'CTATGGCT-CTGAGG CGGAAAGAACCAGCTGGGGCTCTAGGGGGTAT 

' " ' J 1 — 1 — U830 

CCCCACGCGCCCT3TAGC3GCGCATTAAGCGCGGC GGGTGTGGTGGTTACGCGCAGCG7GACCGCTACAC 

" - ' ' ' — «- 0900 

Nae I 

'TGCCAGCGCCCTAGCGC3CGC-CT7TCGCT7TCTTC CCTTCCTTTCTCGCCACGTTCGCCGGCTTTCC 

1 1 — ' 1 — ■ — 0970 

~C 3 TC AAGCTC ~ AA A TCG 33 GC A TCCCT TTAGG 3TT CCG A TTTAG TGC T TTACGG C ACC TCGACCCC AAA 

~ ' ' ' 1 J — — ■ 50 U0 

Dra III

AAACTTGATTAGGGrGA7GG7'CACGTAGTGGGCCATCGC CCTGATAGACGGT7TTTCGCCCT77GACGT 

' 1 — ' 1 1 ■ — ■ i 51 10 

■G6AGTCCACG7TCTTTAATAG7GGAC'CTTG7TC:aaaCTGGAACAACACTCAA CCCTATCTCGGTC7A 

:TCTTTTGATT7ATAAGGGA7r7TGGGGATTTCGG CCTATTGGTTAAAAAATGAGCTGATTTAACAAAAA 

' ' 1 r • » 5250 

-TTAACGC3AA-TAATTCT3TGGAArG-3TGTC AGT"AG3GTGT3GAAAGTCCCCAGGCTCCCCAGGCAG 

~~ " ' "~ 1 — ■ c 320 

Ava III 
Nsi I 

^ASAAGTaTGC^AAGCAT3CA':TCAArT A6-CASCAACCAGGT3TGGAAAGTCCrCAGr,gTrrcr^f 

L - ' 1 — c .290 

Ava III 
Nsi I 

AGGCAGAAoTATGrAAAGC^'GCiTC-Ci^r^GTCAGC^C CAT^G'CCCoCCCCT^CTCCGCCCA'C 

' ~ ' ' — — 6^60 
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Nco I 




AC TG GGC AC A AC AS ACAATCGGC 



T ,. Trn .^-.^rr fi T S 7TCCGSC7STCAGCSCAGGGSCGCCC_SGT ^ 



\0 



Pst I



^TT-TTSTgAAGACCGACCTG-CSG-GCCCTGAATGAA; 



tgc aggacgaggcagcgcggcta^cstgg 5250 



Fsp I 



Bai i 



Pvu II 



Tth I



BssH II



CG*KTGTT=OC= AgC^ 6230 

r -II- 117 -r-'«« T «--------TTTTC-3SA--:ATCGACT6- 



AGG3GC 

Nco I

GA 
Nae I Rsr II

I i 

GGCCGGCTGGGTGTGGCGGACCGCTATCAS3ACATAGCGTTGGCTACCCGTGATATTGC7SAAGAGCTTG 

— 1 ' ' 1 1 ' — 6370 

GC33CGAATGGGCTGACCGC77CCTCG73CT77ACGG7ATCGCC5CTCCC3ATTC3CAGCGCATCGCC77 
- — ■ ' . 1 

Asu II 

C7ATCGCCTTCTTGACGAGT7C7TCTGAGC3GGACTC7GG5G7TCGAAA TGACCGACCAAGCGACGCCCA 
' ' 1 ' : ■ 65 10 

ACCTGCC ATCAC3AGATTTCGATTCCACC3CCGCCT""C"ATGAAAGGTT5GGCTTC3GAATCGTTTTCCG 

— ' 1 ! 1 : 1 6560 

Nae I

•3GACGCCGGCTGGATGA7CCTCCA3CGC333GATCTCAT3CTGGAGTTCTTCGCCCACCCCAACTTGTTT 
■ ■ 5550 

ATTGCAGCrTATAATGGTTACAAATAAAGCAArAGCATCACAAATTTCACAAATAAAGCATTTTTTTCAC 
— ■ ' i , 5720 

Bsm I Sal I 

"GCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCGACCTCTAG 



6790 

C7AGAGC TTGGCGTAATCATGGTCATAGCT3 TTTCC TG T 3TGAA A TTG T TATCCGCTC AC AA T7CCAC AC 
' 1 - ■ . . — — . . 6S60 

AACATACGAGCC33AAGC ATAAAGT37AAA3CC T35GG7GCC7AATG AG T3AGC7AAC T C AC ATT AATTG 

' 1 ' — ■ 6930 

Pvu II 

CGTTGCGCTCACT3CCCGCT77CCAG7C333AAACC7G7:3 7GCCAGC*3CATTAiTGAATCGGCCAACG 



7CO0 

CGCGGGGAGAGGC3G7'73CG7ATT3GGC3CTCTTCCGCTTCCTCGCTCACTGACT:GCTGCGCTCGG7C 
1 — - i ' 1 . . , 7070 

GTTCGGCTGCGGC3AGCGGTA7CAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGoATA 

' ' — • — — — ■ 7 mo 

ACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGC 
— ' , , ' ■ ■ « 7210 

GTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAAC 
' 1 - 1 1 ' 1 7280 

CCGACAGGAC7ATAAAGATACCA3GCGTTT:CCCCTGGAAGCTCCCTCGTGCGC7CT^C73TTCC3ACCC 
■ ' ' ■ 1 ' : ■ 7350 

"GCCGCTAccGGArAccTGTccG::— tctcccttcgggaagcgtggcgctttctcaatgc^cacgctg 

— ' ' . — . 7*20 

ApaL I 

"AG3 TA TC TC AG TTCGG T 3T AGG TC 3 "*C 3C TCCAAGC"3GGC ~GT3*GC AC GmACCCCC Co' 7CA3CCC 
■ ■ ' • 7^90 

eACCGC T 3CGCCTTA y CC33TAACT^ 'CGTC"73^G7CC AACCCGGTAAGACACG^.: ATCGCC AC TGG 

■ - ' J : — — ■ 7560 
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Alwn I 

CA3CAGCCAC7GGTAACAGGA7-.1GCAGAGCGAGGTA7GTAGGC33TGCTACAGAGT~C7TGAAGT3G7G 
, . ; 7620 

gcctaactacggctacactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcgga 

■ . i . « ■ - 7700 

Mil A AG AG TTGG7 AoC7C7T3A7CC33CAAACAAACCaCC3CTGGTAGCGGTGG777T~77GT7TGCAAGC 
, 1 ■ ■ . . ■ 7770 

AGCAGA7 T AC GC 3CAGAA AAAAA3GATC 7CAAGAAGATCCTT7GATC 77TTC7ACGGGG7CTGACGCTC A 
, 1 . 7340 

BspH I

G7G3AACGAAAACTCACGTTAAGGGA777TGG7CA7GAGAT7ATCAAAAAGGATCT7CACCTAGAT:C77 
1 . , , . 1 , 7910 

— AAAT7AAAAAT3AAG T 7T~AAiTCAATCTAAAG7ATATA7GAGTAAACTTGG7C7GACAGTTAC:AAT 
. ■ : . 7560 

GC7TAATCAG7G A3GCACCTA7C TCAGCGATCT3TC TA7TTCG7TCATCCAT AG7TGCC 73ACTCCCCG7 
. 1 , . , 1 s , SQ50 

CGTG7AGATAACTACGA7ACGGGAGGGCTTACCATCTGGCCCCAGTGC7GCAA7GATACCGCGAGACCCA 
. ■ . , , 1 . ■ fii20 

CGCTCACC3GC 7CCAGA TTTA7C AGC A AT AAACCAGCCAGCCGGAAGGGCCGAGCGCAGA AG T 
: , , , , ■ . ^ 6153 
Figure 18 : Phase contrast images of MCF-7 cells transfected with pCB201 (upper)
compared to mock (control) transfected MCF-7 cells (bottom). 

The control cells are spread out on the tissue culture plastic and exhibit few 
filopodia outgrowths. The transfected cells appear smaller because they are slightly 
rounded up and have multiple filopodia outgrowths (arrowhead) per cell. 
Figure 19 : Phase contrast image of MCF-7 cells, transfected with pcDNA3
(19a), pCDU4 (19b), pCDU3 (19c), pCDU2 (19d) and pTB72 (19e).
Figure 20 : F-actin pattern (visualized with TRITC-Phalloidin) of
MCF-7 cells transfected with pcDNA3.LacZ (top panel) and with
pCB201 (middle and lower panel).
Figure 21 : F-actin pattern (visualized with TRITC-Phalloidin) of
MCF-7 cells transfected with pcDNA3 (21a), pCDU4 (21b), pCDU3 (21c),
pCDU2 (21d) and pTB72 (21e).
Figure 22 : Phase contrast image of N4 neuroblastoma cells, transfected with
pcDNA3 (22a), pCDU4 (22b), pCDU3 (22c), pCDU2 (22d) and pTB72 (22e).
Figure 23 : F-actin pattern (visualized with TRITC-Phalloidin) of
N4 neuroblastoma cells transfected with pcDNA3 (23a), pCDU4 (23b), pCDU3
(23c), pCDU2 (23d) and pTB72 (23e).
Figure 24 : Phase contrast images of small, medium sized and
large foci induced in a monolayer of NIH-3T3 cells by
transfection with pCB201.




Figure 25a, b, c: Chromosomal localisation of hu-unc-53/1 by FISH
Figure 25d, e, f: Chromosomal localisation of hu-unc-53/2 by FISH




Figure 26 : Expression pattern of Hu-unc-53/1 and Hu-unc-53/2 in normal
human tissues and cancer cell lines by Northern blotting. 
Settings: Linear, Certain Sites Only, Standard Genetic Code

CTAAATTCTAAGCCTTAATATTTTXiTTAAAATTCGCGTTAAAr 1 1 T Id I AAATCAC^CATTTTTTAACCAATAGOCCCAAATCGGCAAAATCCCTTAT 100 

AMTCAAMG^TAGACCGAGArACGCXrGAGTGTrGTrCCACTn 286 

CCGTCTATCAGGGCGArG^jCCCACTACGTGAACCATCACCCTto 308 

CCCCCUTTTAGAGCTTCaCGGGGAAAtfCGGCGA^ 4W 

CTCACCCTCCaCTAACCACCACACCCCCC&CCCTTAATCWCCCCTACACGQCGCCTCCCAnCGCCATTCACGCTGCC 500 

CCCTGCGCCCCTCTTCCCTATrACGCCAGCTGOCGAAAGGGGGATGTGCTGCAAGGCC^^ 600 

TAAAACG*CGGCCAGTWGCGCGCGTAATACGACTCACTATAGGGCGM^ 700 

CTGCAGQMTTCGATATCAACCTTATCGATACCGTCWCCTCGAGAGCGGGGG^ 800 

ATTATGCAMTGACAAAGAGAGGAGAATGGTGACCAGCCACCAGTGTC6 900 

!TTG{XTG7CTG7GTrCCAA>TATGGAGGCACACrG^ 1000 

ACTCAACTAGGAmCAAATAAiAAATTTCAAACAmAGATTTrm U00 

CTACGCTTTCAArrTrrGAATTTTCTCCAACTCATCAAC^ U00 
TCTTAAACTGTTTCTAG T AAAA AA GCTC CAAATT A TTTTT G C ATAATATT CCTTAGAAGTA C ATTTTTTATCAGC TCATC G GG AGCTTTGGCA CCTAT6G U00 

GCrCTGCTAGGCCCCGCCACAAAAACCTTCCGACCMCTGTAATTTTTCTCCCCAGACCCACGGM^^ W00 

GGAATTCAAATTTGGCACTTTAATCTTTAAAAAAATTGGAATTTCCAAAAAl 1500 

TAATTTGAGATCCGAATTGACACACATTTTTGGCTAGAGTTTGAAAAAGTGGGTCCCG 1£00 

GAAATAjUACCAAtaTATTGAACACAATGCTto 1708 

CACTATCACAJUUCGAATTCAATTCCGCCAAJUAGATGGT^ 1S00 

ACAAGG(>ATCGGAAGGAAACCGGCGACGGAAGGGAAGGGAAGGAAGCAAGTCGCCA>AAATG6AAGCGGG^ 1900 

CTTGTGCGCCAAATCTAGG<rr(JUTTCAXAG^ 2000 

CTTATTTCATAACCTAAATTAJUTGGACAG^CTTGAAAG^ 2100 
ACAGMCTACAGAATTTCAAAATTCATGAAACTTTC GTAATACCAACTATCrrGCCCATCTCC^CTGAAACATCTCAAATTTCATAACTAGTTTCAAATT 2200 

CAGAGGCATAAOAJUUAGCAACAAAA<KAJUU^TGCTCTT^^ 2300 

GMAUGACACACTMA^A>AC>mCGATATCCtCCTAGGACACAAmGCGCTAAACCAaTGTrCC 2480 

GCAACTTCTTCCCAtaCTTTGGCCCAAAAAGTATTGGAC^ 2500 

CACTGTTTGA>AAnACGC(rrGACCCAAACAAAATTT^ 2600 
TAATrTGCAATCTAAGCTCCAAACTATTTCAC^ 

GCACTTCCCGCATGAATAGA\TG T CTGTGTAACAGGGACT^ 2800 
TCaCCATCACCAAGTGATGaCArTAATTTTGA&Cm 

gtgtcaawgactacaactaoactaccaaaactcacttgtcaggatagccgttctgtgctccacgactcatgtca »00 
Fig 27 pWP3 Map (1 > 13621) Site and Sequence

ftTTTTGCCCCACATCCCAAC ACCCCCCTCACAGGTGTACCCCCTG<iATTCr GAAAT ATTTTGAGACTTATGiCCATTCAACTCTCGAAC C I f 1 1 1 1 r T C A 3100 
AATTCTAACGCCAATCTGCTCAAAAATTACTSTCO 3288 
CCUUrAGCTrAAGCATGTAGC^ATGTAAGCTGTra 3380 
TACTCTTTATTGGCTTGGT AAATATCAAGCCTATCTGCCTACATCATACCTnCTGCCTACACACCTGCCTATGCCTAAAACn AATACATCTCAAGAGA 3403 
ACTTrO^CAATTCCAAAAATTAAAATACTATAGATfOUAAA rTATCAAAAGAAAAAGTTTTTGGTACGACCTCCAAC GTTGGTTAACTTT AAACGAAAC 3S00 
AGTTCAGAAACCAAAAAA TCAAAAGTTTT C CTTTGACAAACATCATTTTTCTGGGATTTTTTAAAArTTATAATTTGAAGCT TTCAACAAAAAAAA 3600 
TCAAAGTTTCCAAAATTTACCTGATGTCTTAT^ 3700 
GCATGA6CCACAATGGAGTCTTC AG ACGCGGACGA (T OCTCATATCCT GCA AAATGGT CATTTGTCATAATAATTG GAGGTGGC 6G GTG GACTGACC AAA 3600 
CGACrTGACACGCGATGTGATTTTTC L l K I M I AGAAGAACATCGTACG ATAAGGTATTTTG't GTGCGGTrTTTTCTA rGATGTTTC AAAAA AA A TTTA 3000 
TCAGAAGTAAAAAAAACATAA GAA CT 6TACACAGAACATTCTGTAGATTTTGGG AT ACAAmTTATTAA7G«iGG<U6GC^GCACAATTTTTTTCCACT 4000 
TTTTATTACGAA TTMTTCTATTTTTGAATCTACACAA^ AAAaAACAAAGAAATTAAGAACAAG f I M IT 4100 
TGATAATAACAAJkAATAGGGGGTGGWGUKAGW^ 4200 
GCTCyuUWTTTAAATATTACTCG'TTAGTTA 3TACAACTCAGTTTAAACTT ACTCTCATGACAGCGGCCCC^ATTAITTTTGACACCA^CAAjnTCGTAA 4300 
TW7A6CGACCGKGCrCAGTr6GAATTCTACGAATGCTATrTGTATAGTrCATCCATGCCATGTGTM 4400 
CATGTCGTCTCTCTTTTCGTTCGG ATCTTT CGAAAGCGC AGAT7 GTCTGGAC AGGT AA TG677 GT CTGGTAAAAGGAC AGGGCCATCGCCAATTGGAGTA 4S00 
TTTTCrrGATAATGGTCTXTACTTG^ACGCTTCCATCTTC^TGT^ 4€00 
CCA AGTTTAAACTT ACAACTTT W TTCCATTCTTTTGTTT GTCTGCCATGATGTA TACATTGTCTGAGTTATAGTTCT ATTCCAATTTGTCTCCAAGAAT 4700 
GTTTCCATCTTCTTTAAAATCAATACCTTTTAACTCGATTCTATTAACAAGCGTATC 4800 
CGTTAGTTAGTACCGAACTGTTTAMCTTACGrGTCT^ 4980 
TCTT<UAAA^GTCATGCCGTTTCATArGATCT^ 5000 
ATCAGGGTTACTT ACTA TATATATGTTT AAA CTTACCCATGGA^ 5100 
TCACCTTCACCCTCTCCACTGACAGAJUUTTTCTr^ 5200 
TACTCATTTTTTCTACCGGTACCCTCCAAGCGT^ 33O0 
AACGTCCGTCCTGTCGATGCTCATTGATTGGAGCTTTTCAGCTTCTGGTTTTTC^ 3400 
CAGCATTCTTC&AAAJ^TACAAATArTTTAAGATTCCTAGTTTTGACA^ 5500 
rrC^GAAATCACACATTT AAAA I f T I T AAATTTTCCTCAAAGGAA ! 1 1 1 ! 1 1 I CAATTTCTGGAATT7TTO CAA TTTTGTTTTTTACAATTTTTGGAAT 5609 
TTTTCTCAAM MM ICCGG^AAATAGTGAATTTTTTGATAGTnTTGACTATGTCTCTfl mCTGAAAATATGAATCTTTTGATAI rTTTTTTGTTCAT 5700 
TTT7TGAAAAGTAGCTTTATTTGTGAGTTTTCAAAATAI I T T I T T 1GGAATTAAATTTCCC M 1 1 l CTGATTTTTTGTCGTTCCTCCGAACGAAAAAATC 3380 
GAATAAATTAAJUkTGTATTTTAAAATATGTTTTGTAAJlTTCACAAA 5900 
AAmAGAAATTTTTATAGAGTATTATTTTATTTTTTAACTCAA 0000 
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GAACCTTCTACA^TAAAATAATTGTACTCCTCCTTACTAT^ 61ae 

TC AC CGAC CGAAACATAAGTAAA GTTCA GAAAT AAACCT ACCATTTGTC AA CTTCTTATGTGCCATAATTG 6ACT AACAACTCCGATTCTTGGCTCCGTG 6208 

TCACCTGGCGGAACA6CTGGACG TX»CATCTTTTTTC GCTGTGCTTTTCACTCCTTrC ACTGCCAGTGTTGGT^ 6309 

AATArAGGAAGGCTCTTAGGAAGCCGAAGCCAGACATTAAAGTGrATCAAAAT^^ 

GGAAGAATGGAAAGACGGAAAGCGTATTGAATTTCAAGATATTTTC^ 65O0 
TGCAACTACTCGGGCCAAAGAATATGTACA TGGTTCAGGCCGGCCCAGTTTAGCCCATGCTTGATCAATGGCTTI GCCA^ 6M8 
AATTTAGGCCAAACTTGACAAGAAGCCTCTACCCAGAAACAAAAAATACTCACCTCMCAG 67Q0 
TCGGCTGT CT6ATAGC7 GTCACGATGGAGG CGTTAAGAGT AAGA TCGTCTG ATGACGG A(^CTTTTCGTCTGACGTTG AACT CTGAAAATTTTGAAGTTT 6880 
T A TAG AAGGGGA CGGTTTCAAA TG G G AT AAATTGAAACTT A^GTTTCATCTCTAAAAATTCAGTTGC ACAAAAAATTA GTTTC CAAOT ATAAAATGA G 6908 
ATTArTAATTCGAAAGCTTCTCGGCGACCACAAAAGTT^ 7888 
AMCATTTTTTCCTAAATATmAAAClTTTAACATCAA 7188 
C CTAAAAATGTATAAAW A GGTCC T CTG C AMGTTTTTTGT A CCAACAAA7 GGGCCGAnCACATTTTAT WCAGGTGTCCCAGAG ACCTCACTTTT ATA 7200 
AACATCCGAGACTATGIGAAGGTCACGAAAACAAAAAAAAACACTTCTCCTAAACAACGAACCTT 7308 
TCGA TG ATGACGTTGGCG A CGTGCTGTTG AATCCAGCGT ATCCCG AXTCTTCTTCA CTGCTCTTTGAGCATCGTT^C GAGTCTTCTTCAGAT^ 7408 
GGCGTCCGTTTTACGGTAGGAAACTTTTGGOGGAA^ 7388 
TCATAAATAAT AGAAGCCATAAGACAA TTCC GGAAA GT CAAAA6AATCTCGT ATATTTCAGT AGGAA CTTTTT AAAAAATAT AGCAG GTCTAAG AAGCTCT 7608 
A ATTT ATATG AAAAAAATAAATACT AGAAA CTGAAAAC GTAAA GTT AATTTTT C AAAAAGTGC ATCT CA CACAAAAATGTTTT G CAA TTTGTGCTCTTT C ?708 
ACTCTGTAGTAGCTCATTTTGAAAMCGAATTGAMAA 7 SOT 

CGACGTCGAAAAATGCTCGGAATATCACTTCCGTTGUU^CATrrTTCCGCGAA 7988 
TGAAAGAGTGTCAAGGAGAAAGAAG^AATGGGAGACAArrCCDLATTG 8808 
CGGAAOAAGTACGMTCGGGrrCTAAATATAGATGAGGCCCAGAGCTT^ 1100 
CTCACTTGTCTTTGGTCCTATA C AAAAA TT A TCAACTGT CAAAAT A TTTTTT A M IC! 1 Tl TCTATATCTGATATCTTCAATATGAACCAAGATATAAAT 4280 
CTTCAAA(^AAAATA(rrGGCATGCAATACTATCAGAGGAAAAAATCnrA 8388 
TTCCGATTTCAAATACTCACCGTACAAAGCTTCWCATAGACGTGGCACTrCCCAGCT^ 8488 
nTTC<UCAAAGTTT5TTGTTGAGGCACCGCCGCCGCCTrTCTCGTAGGT7GTGGGC7AT^ SS80 

rrruGCATTccAccAcacacaccGCTGTrAra S688* 

GCGGCTAGCTTTGAGCTTCCGATTTTTGTACTTGTAGCAAC>CCAACTAC 3738 
UATTCGAAATAGAGCTGTA C GTTGATC ATG A TTCTGAMTAA TA T AATTTCT AAAAAG A AGCCG GAAGGATCG G AT A C CTAAGCTCTTCGCAGATGTGGA 8888 
TATCCTTCGAGCCAACATTATTATTGCCACTCGAACGGCTCGACGGACGGAAT^ t988 
CGUUCCrCTTrGGCTTGATACCAATCTTTGATGAATCTGTAATTCUUACT 9888 
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AAGT AACA\AGAAACTG<UAACATTTAA AATTGAAMCTCTCTTGGAAA TC TAGT^ 9iaa 

GCAAGTCATTTCTGCTTGG GCTTCA GCTTATGCGCAACTAAGATCT AG AATAAGGCTaTAGTT C AGGCT7 AGG A7TTCCGAGTTCAATCTAAACGTTACA 9203 

CrACATTTATGCA&GCACTTCAGTTTTCTCTrTC^GCCCAATCTTGCCT 930e 

rn"(TrTTTTTTCTTCTTTTCCAAATCT^ 5^ 

TAMTATACTAAATTT7CCCGGCT7TCAGG<^GCCACAAAAAATAGGCGCG&^ g S00 

AWTatxcTt^mGAGACccAACGwcGAGATGAAMcccrcACAccccA^^ $6 ee 

a ATAti 1 1 1 lATCATACGAtrrGAGATGGATGAAGAGGTGAGAAACTTTTTTAJlTGGGCCAAAAAGCATGGA^ 9700 
AGATTGAATTTGATAGATAATCTGTGCTTGAATTAAAATrCTAACTGC^ Mea 
GGATATCTC CTGAGTAGCCATTCATAGC ATGGG AAATGTGACACG GAGCTTCCTA TA ACTTGGG G G A TTCA A G GTTTTG AGATG G AAAACA GTGAT AC AT 9900 
ATAAACATATTGTCAACAGTAGAAATAAA>TTTTAGCCATACAAT^ iee0€ 
AATATCb 1 1 1 1 u i AAA TAAACTTTTAAAAGTCGTCTTTATCCO AATTGC ACAAAATTAAATA TCA AAATA AA TGGCTTGAATAGT ACT AAAATCT ATT 16 lee 
CCACrnTGTGCTA^CCCAGAATGAGTWCGGAATAWAAAGAGCGCATAAAACCCGACATTAAT^ 102« 
CTTGAGACGATGAAAA^CAATTCTACACTCATACTACTACCAGCCAGTGTCATCGGCrCTATCTTTATTCA^ 10302 
TCTW TTCGAArTAATTGACCGTOCCCCGTGTAGTTTCCTTCCACACA UMOC 
GAGATGCAACGGGAGGAGGAGGCGGGAGAAGGAGGGAGAAGAGGGACAACCTCTCCCT^ 1030C 
AAAAGTCATTATGAAGCCGAGAAATTGATTTTGGTGGGCACTTTTCGGGCMGGGG 

AACATGCTGGCGCTGAGGGTTrGAAATACnTTTTTTTACTT^ 1270C 
TCTTTTAAACATTTGAGAATAGCACAAATTTGTTTCAGAT^TTATTTCCAGTA IO80C 
TrGGATCTATTACAAAGCTCTCAACTGACAAWnTCTGCAAAATACTTTTCTACATT^ 1090? 
ATATAAGCCGTCATATTAGAGCAATAAAGCTGGAAGrrTTCCAMAAAACTACATTrTTATCATCTT 1100* 
AAATAAAATrCTTACCAATTTrCGATATTCTrGACTtrrGGACTCTGAAGCCTGCA 1110? 
GCrrTGO^GACGTGGCWCACCTGGCGAGGGTAATCTCiAAMTGGAATTGTATTGCA^ 11208 
GGCAAATAAaAAACTAAATAATGTTCCTmAATATmaCAATgUAM 1130e 
TTAAAGAACACATTTTTGTAGTCGTAA ACT CAAAATTAAACTCACTTAGA AACCGCGG (TrGGCATAATGGATGTGGGTAGTTGCTCCAATTTCTTCTCA T 1140C 
CTCGAXGGGGGCCCGffrACCCAGCrTTTGTTCCCTTTAGTGACGCTT AATTG CGCGCTTGGCGT AATCATGGTCATAGCTGTTTCCTCTGTCAAATTGTr 11508 
A TCCGCTCaCAATTCCaCACAACATaCGAGCCCG AAGCATAAAGT6TAAAGCCTGGGGTGCCTAATGAGTCACCTAACT CACATTAATTGCCTT6CGCTC 1160C 
ACTGCCCG<TrTCCAOTCG«iAAACaCTCGTGCCA(KrTG<>rrAATGAATCGGCCAACGCGCGGGGAC^ 1170« 
TCCTCGCTCACTCACTCGOGCGCTCGCTCCrrCGGCTt&GGCGAOvGGTATCAG 11804 
ACftUGGA^GMUTGTGAGCA^GGCtt(KAA^Gtta6GAAeC^ 1190S 
<^KA<^AAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAW^ 1200* 



WO 98/24810 247/270 PCT/EP97/06956



donderdag, 27 november 1997 16:52

jjg M ap fl > 1362^ Sfo Saouoma 



(nrccwcccTKcccTTAccGCATACCTCTcccKCTTTCTccrrrcGW 121W 

A^OTCGCTCCAA^CttCTG^ u?Cfi 
ACACGACTTATCGCCACTGGCAGCAGCCACTG GTAACAG GATTAG CAGAGC GAGGT ATGTAGGCGGTGCT ACAGAGTTClTGAACrGGTCC CCTAACTAC 123W 
GGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAG^ 

CTG^ACCGGTGGTrrTTTTGTTTCCAAGCAGCAGATTA CGCGC AGAAAAAAA GGATCTCAAGA AGATCCTTT G ATCTTTTCTAC G G6 GTCTGA CC CTCA 12S0C 
CTGGAAC CAAAA CTCA CGTTAAG GGATTTTGCTCATGAGA TT ATCAAAAAGGATCITCA CCTA GATCCTTTT AAA TT AAAAATGAA (HTTTAAATCAA TC XZW 
TAAACTATATATGAGTAA^CTTGGTCTGACAGTTACCAATGCT^ U7M 
GACTCCCCOTCCTGTAGATAACTACGATACGGGAGGGCrTACC^ U UU 
ATOUiCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGCT UM 
GTAACTAGTTCGCCA6TTAATACTTT6C GCAACCTTGTTGC C ATTGCT ACAGGCArCGTGGTiTrCACGCTCGTCCrrTGGTATGGCTTCATTCAGCT CC G UOC« 
GTTCCCAACGATCA^GGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCC^CGGTCCT uiflc 
AGTGTTATCACTCATCCTTATGGCAGCACTGCATAATTCT^ 

TTCTGAI^TAGTGTATGCGGCGACCGAGTTGCTCTT GCCCG GCGTCA AtACGGGATAATACCGCGCCAC at agcag AACTTTAAAACTGCTCATCATTG 133GC 
^^^CC^C'TTCGGGGCGAAAAC TCTCAAGGATCTTACCGCTGTTCAGATCCAGTTCGATGT AACCCACTCGT GC ACCCAACTGATCTTCAGCATCTTT 
TACTTTCACCAGCGTTTCTGGGTGAGCAAAMCAGCAAGGCAAAATGCCCCAAAAAAGGGAATAaGGGCGAW J353C 
CTTTTI CAATATT ATTGAAGCAT7T ATCAGGGTT ATTGTCTCATGA GCC C ATACATATTTGAATGTATTT A GAAAA ATAAACAAATAGGGGTTCC GCGCA IJOW 
CATTTCCCCGAAAAGTCCCAC 13621 






vrijdag 22 november 1996 16:02 - Page 1 of 1

P6VR£ QJ. 



SIGNATURE SEQUENCES : 

Different signatures can be used to define or identify the UNC-53 gene. 

Amino acids are listed in one-letter code. 

X equals any amino acid. 

X(3,5) equals 3 to 5 X's. 

(D,E) means D or E at a given position. 

The signatures should be used to screen a database using a weight matrix 

of conservative substitutions. 
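
The signature notation described above maps naturally onto a regular expression. The following is a minimal sketch under stated assumptions, not the screening procedure of this document: it performs exact pattern matching only and does not apply a weight matrix for conservative substitutions. The function names `signature_to_regex` and `find_signature` and the test peptide are hypothetical and chosen for illustration. Alternatives must be written in parentheses, separated by a comma or a slash.

```python
import re


def signature_to_regex(signature: str) -> str:
    """Translate signature notation into a regular expression string.

    X       -> any one amino acid
    X(m,n)  -> m to n amino acids of any kind
    (D,E)   -> D or E at that position; (V/L/I) is accepted as well
    """
    sig = signature.replace(" ", "")
    parts = []
    i = 0
    while i < len(sig):
        ch = sig[i]
        if ch == "X":
            repeat = re.match(r"X\((\d+),(\d+)\)", sig[i:])
            if repeat:
                parts.append("[A-Z]{%s,%s}" % repeat.groups())
                i += repeat.end()
            else:
                parts.append("[A-Z]")
                i += 1
        elif ch == "(":
            close = sig.index(")", i)
            residues = [r for r in re.split(r"[,/]", sig[i + 1:close]) if r]
            parts.append("[" + "".join(residues) + "]")
            i = close + 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def find_signature(signature: str, protein: str):
    """Return (1-based start position, matched text) for every exact match."""
    pattern = re.compile(signature_to_regex(signature))
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical test peptide; not taken from a database.
    hits = find_signature("(W/F) (D/E) DSSS (V/L/I) SSGISD (T/N)",
                          "MAAFEDSSSLSSGISDNKK")
    print(hits)
```

A search that tolerates conservative substitutions would replace the exact character classes with a scored comparison against a substitution matrix; that step is outside this sketch.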

BLOCK A : 
BLOCK B : 

KXKKSWXXXXXXXXFXK 

BLOCK C : 
LARGE FAMILY : 

yFRTFRPA TFF:Mny . 

BLOCK D : 
LARGE FAMILY : 

VFRTFRPA T FFAX^ f y ■ 

BLOCK E : 

GXXGXGKS/T 

and 

F<K.R ) MXX X SNX(3.8)GF( 1 .L 1 VK I .L.V,< R / K ,V(I.L.V )( R.KXR.KK R .KKV (D .E 1 



(W/F) (D/E) DSSS (V/L/I) SSGISD (T/N) 
fig 29 pEGFPsac (1 > 5100) Site and Sequence 

Enzymes : 72 of 148 enzymes (Filtered) 

Settings: Linear, Certain Sites Only, Standard Genetic Code 
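
The map that follows lists restriction sites above a double-stranded sequence. As a minimal sketch of how such a map can be derived from a linear sequence, the code below scans the top strand for a few recognition sequences and prints the complementary strand. It is not the program that produced this figure. The names `SITES`, `complement` and `find_sites` are hypothetical, only a handful of the enzymes named in the map are included, and the 40-base fragment is transcribed from the top strand of the map.

```python
# Recognition sequences (5'->3') for a few enzymes that appear in the map.
SITES = {
    "AatII": "GACGTC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the complementary strand, written in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def find_sites(seq: str, sites=SITES):
    """Return sorted (1-based position, enzyme) pairs for a linear sequence."""
    seq = seq.upper().replace(" ", "")
    hits = []
    for name, site in sites.items():
        start = seq.find(site)
        while start != -1:
            hits.append((start + 1, name))
            start = seq.find(site, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    fragment = "CCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATG"
    print(fragment)
    print(complement(fragment).lower())
    for position, enzyme in find_sites(fragment):
        print(position, enzyme)
```

Positions are counted from the first base of the fragment passed in, so they need an offset to match the coordinates printed at the end of each map line.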



{ Bgl I 

TAG TTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCG TT AC ATA A C TTACGGTAAATGGCCCGCCTGGC TGACCG 

ATCAATAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGCCATTTACCGGGCGGACCGACTGGC 
• LL IV1NYGV1SS . PI VGVPRYI TYGKVpAVLT 

j* at » Aat II 

CCC AACGACCCCCGCCCATTGACGTC AA TAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATG GGTGGAGTATTTACGGT 

gguTtgctgggggcgggtaactgcagttattactgcatacaagggtatcattgcggttatccctgaaaggtaactgcagttacccacctcataaatgcca 200 

A Q R P P P I D V N N 0 V C S H S N A N R DFPLTSMGGVFTV 

Bg' ' I Aat II Bgl I 

aaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaa atggcccgcct 
tttgacgggtgaaccgtca tgtagttcacatagtatacggttcatgcgggggataactgcagttactgccatttaccgggcggaccgtaatacggg tcat 300 

N C P L G S T S S V S Y A K Y A P Y R 0. R MARLAL CPV 

SnaB I Nco I 

CATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGA TGCGGTTTTGGCAGTACATCAATGGGCGTGGA 
GTACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGTAGCCATAATGGTACCACTACGCCAAAACCGTCATGTAGTTACCCGCACCT "°° 
HDL MGLSY LAVHLR 1 SHRYYHGDAVLAVHQVAV 



Aat II 

TaGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTC AATGGGAGTTTGTTTTGGCACCAAAATCAACGG GACTTTCC AAAATGTCGTA 
ATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGC AGTTACCCTCAAAC AAAACCGTGGTTT TAGTTGCCC TGAAAGGTTTTACAGC at 
1 A V ; L T G , 1 S K S P P H , R Q y E F V L A P KSTGLSKMS 

Nhe I Ec47 

AC A AC TCCGCCCCATTGACGCAAATGGGCGGTAGGC5TGTACGGTGGGAGGTC TATATAAGC AGAGC TG GTTTAGTGAACCG TCAGATCCGC TAGCGC TA 
TGTTGAGGCGGGGTAAC TGCGTTTACCCGCCA TCCGCACATGCCACCCTCCAGATATATTCGrc TCGACCAAATCACTTGGCAGTCTAGGCGATCGCGAT 
QLR P|DA NGR.AC TVG GLYKQS VFSEPSOPLAL 

Nco I 

ccggt:gccaccatggtgagcaagggccaggag:tgttcaccggggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaagttcagcc 



GGCCA3CGGTGG ^^CC^AC^TCC T T C C C G C^T C C T C 3 A C A A G ^ .^S.™^?.^.'^!r-^^ < ^^ A ^^ ji ^' ^" TCGAC C TGCCGC TGCAT T TGCCGGTGT TC AAG TCuC 



7 CO 



° V * T M v 3 * G E E L F T G V V P I L V £ L 0 GOVNGHKFo 
■GTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGC TGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCC CGTGCCCTGGCCCACCCTCGTGA: 



V S G E G E G OATYG <LTLKF|CTTGKLPV?V 



P T C V T 



fig 29 pEGFPsac (1 > 5100) Site and Sequence 

CACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAA5CAGCACGAC TTCTTCj^AGT CCGCCATGCCCGAAGGCTACGTCCAGGAG 
TGGATGCCGCACGTCACC AAGTCGGCG ATGGGGCTGGTG TACT TCGTCGTGCTG A AGAAG TTCAGGCGGTACGGGC TTCCGA TGC AGGTCC TC 



TLTYGVQCFSRYPDHMKQHDFFKS 



AMPEGYVQE 



J<spl 

I 

CGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTG AACCGCATCGAGCTGAAGGGCATCG 



R T I F F K 0 P G N Y K T R A E V K F E G 0 TLVNR IELKG I 
ACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTAC aacagccacaacgtctatatcatggccgacaagcagaa gaacggcatcaa 



I ICC 



0 F K E D G N , 1 L G H K I E Y N Y N S H N V Y I MA D K 0 K N G I k 

ggtgaacttcaagatccgccacaacatcgaggacggcagcgtgcagctcgccgaccactaccagcagaacac ccccatcggcgacggccccgtgctgctg 



I20C 



V N F K ^ ^ H N I E D G S V Q L A 0 H Y Q Q N TPIGOGPVLL 
CCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGC AAAGACCCC AACGAGAAGCGCGArCACATGGTCC TGCTGGAGTTCGTGACCGCCGCCGGGm 



PON HYLSTQ S ALS KDPNEKROHMVLLEFVTAAG 



f S P M,! pcoN. 
TCACTCTCGGCATGGACGAGCTGTACAAGTCCGGACTCAGATCTACGTCAAATGTAGAATrGATACCAATC TACACGGATTGG GCCAATCGGCACC TTTC 



-C.e.unc53 sac 



' T L G " 0 E I Y K S G L R S T S N V £ L IP \ Y T 0 V A N R H L 3 

N'U 1 JEcoR I 

GAAGGGC AGCTTATCAAAG TCGATTAGGGATATTTCC AA TGATTTTCGCG ACTATCGACTGGTTTCTCaGC tt at taatgtg atcgttc cgatc aacgaa 



~~* " C.e.unc53sac — ■ — — — 

* G S L S K S [ R Q t S N 0 F R DYRLVSQL I N V I V 3 I fl E 
fig 29 pEGFPsac (1 > 5100) Site and Sequence Page 3 



Bsm I 

i 

TTCTCGCCTGCATTCACGAAACGTTTGGCAAAAATCACArCGAACCTGGATGGCCTCGA AACGTGTCTCGACTACCTGAAAAA 



TCTGGGTCTCGACTGCT 



-C.e.unc53 sac 



FSPAFTKRLAK ITSNlDGLETC 



UOYLKNLGLOC 



f coRV Pvull Ksp632l Hind III 

CGAAACTCACCAAAACCCATATCGACAGCGGAAACTTGGGTGCAGTTCTCCAGCTKTCTTCCTGCTC 

^•^^^ !7C,; 






- C.e.unc53 sac 



S K L T K f I DS GNLGAVL QL LFLLSTYKQKLRQLf 



Sst II 

Xma I 
5m a I 



j pam HI jXba I Bel I 

AAAAGATCAGAAGAAATTGGAGCAACTACCCACATCCATTATGCCACCCGCGGGCtt 

TTTTC TA GTCTTCTTTAACCTCGTTGATGGGTGTAGGTAATACGGTGGGCGCCCGGGCCCTAGGTGGCC TAGA TCTATTGACTAG TATTAGTCGGT ATGG '** 




- C.e.unc53 sac 1 

KOO KKI EOLPTS I MP PAGPGSTGSR . L I I [ SHT 



P a 1 pstn I ^ pa I 

ACATrrGTAGAGGrTTTACrTGCTTTAAAAAACCTCCCACACCTCCCCCTGAACCTGAAACArAAAATGAATGCAATTGTTGTTGTTLcTTGT TTATT; 
TGTAAAC ATC TCCAAAATG AACGAAATTTTTTGGAGGGTGTGGAGGGGGAC TTGGACTTrGTATTTTAC TTACGTTAACAAC AACAATTGAACAAA TZAZ '** 
T F V E V L L A «■ K . N L P H L P L N L .< H K » N A I V V V N L F I 

Bsm I 

CAGCTrArAATGG TTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTrTTCACTGCArTCTAGTTGTGGTTTGTCCAAACTCATCAA 
GTCGAATATTACCAATGTTTATTTCGTTATCGTAGTGTTTAAAGTGTTTATTTCGTAAAAAAAGTGACGTAAGATCAACACC AAACAGGTTTGAGTAGTT *** 
AAYNG YK . SNS I T N F T M K A F F S L HSSCGLSKL I M 

.Mlu I ssp I 

TGTATCTTAACGCGT AAATTGTAAGCGTTAATATTTTGTrAAAATTCSCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAA TAGGCCGAAATCG; 
ACATAGAAITGCGCATTTAACATTCGCAATTATAAAACAATTTTAAGCCCAATTTAAMAACAAtUAGTCGAGTAAAAAATTGGTTATCCGGCTTTAG." 
V S • * V " C K » • y F V , * ' » V K FLLMQL IF . P I G P M F 
fig 29 pEGFPsac (1 > 5100) Site and Sequence 

Bsrl 

CAAAATCCC TTATAAATCAAAAGAATAGACCGAGATAGGGTT^ 

GTTfTAGGGAATATTTAGTTTTCTTATC TGGCTC TATCCCAACTCACAACAAGGTCAAACCT TGTTC TCAGGTGATAATTTC TTGCACC TGAGGTT3CA2 
0 N P > ; ! * R | D , * 0 » V £ C C S S L 6 Q E S T I K £ R G L Q P ^ 

Dra ill 

AAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACCCTAATCAAGTTTTTTGGGGTC GAGGTGCCGTAAAGCACTAAATCGGA 

tttcccgctttttggcagatagtcccgctaccgggtgatgcacttggtag tgggattagttcaaaaaaccccagctccacggcatttcgtgatttagcct 

QR AKNRL SGR VP TT. T I T L I KFFG V EVP . STKSE 

Nae I 

] 

ACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGC^ 

TGGGATTTCCCTCGGGGGCTAAATC TCGAACTGCCCCTTTCGGCCGCTTGCACCGCTCTTTCCTTCCCTTCTTTCGCTTTCC TCGCCCGCGATCCCGCGA ^ 
P ; R . E P P 1 ; S L T G K A G E R G E K G R E E 5 E R S G R . G A 

f»P' J<S P I 
GGCAAGTGTAGCGGTC ACGCTGCGCGTAACCACCACACCCGCCGCGCrTAATGCGCCGCTACAGGGCGCGrCAGGTGGCA CTTTTCGGGGAA 
CCGTTCACATCGCCAGrGCGACGCGCATrGGTGGTGTGGGCGGCGCGAATTACGCGGCGATG TCCCGCGCAGTCCACCGTGAAAAGCCCCTTTACACGC3 
G K C S G H A A R N H H T R R A CAATGRV R V'HFSGKCA 

P S P H 1 Ssp I Ksp632l 

GGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAA TGCTTCAATAAT ATTGAAAAAGGAAOh 

CCrrGGGGATAAACAAATAAAAAGATTrATGTAAGTTTATACATAGGCGAGTACTCTGTTATTGGGACTATTTACGAAGTTATTATAACTTTTTCCTTCT ^ 
9 * P Y L F ! , F L " T F X V V S A H E T I T L ! N A S ! I L K K E \ 



OxaN I pvu II 

i i 



?ph I 
! Ava 111 
| JMsi I 

gtcctgaggcggaaagaaccagctgtggaatgtgtgtcagttagggtgtggaaagtccccaggctccccagcaggcagaagtatg caaagcatgLW 

CA3GACTCCGCCTTTC TTGGTCGAC ACC fTACACACAGTCAATCCC AC ACC TTTCAGGG3TCCGAGGGG TCGTCCGTCTTCAFACGT TTCGTACGTAGA^ 1 
3 • G G K " Q L V N V C Q . L G C G K S P G S P A G R S M Q S M H l 

Sph I 
j Ava III 
| Nsil 

AATTAGTCAGCAA CCAGGTGTGGAAAGTCCCC AGGC TCCCCAGCAGGC AGAAGTATGC AAAGCATGCATCTCAATTAGTC AGC AACCATAG TCCCGCCCC 
T ^^^rCAGTCGTTGGTCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTC r TC ATACG TT TC GTACG TAG AG T TAATC AG TCGTTGG TATCAGGGC 3GG3 * 
* ' S A T * C G K S . P G S P AGRSMOSMHLN.SATIVPF 



Bsrl Nco I 

I I 



Bgl I 
Sh I 



rGAGGCGGGTAGGGCGGGGATTGAGGCGGGTC AAG5CGGG TaAGaGGCGGGGTACCOaC TGATTAAAAAAAATAAA TACG TC TCCG'jC FCCGoCGGa-1: '"' 
^ PP | PP L TP PSSAH S P P H G . L I F F | Y A £ A E W 



fig 29 pEGFPsac (1 > 5100) Site and Sequence 

Stu I 

j /Wr II cia I 

< I I 

GGCCTCTGJGCTATTCCAG^GTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAGATCGATCAAGAG ACAGGATGAGGATCGTTTCGCATGA' 
CCGGAGACTCGATAAGGTC TTCATCACTCC TCCGAAAAAACC TCCGGATCCGAAAACG TTTC TAGC TAGTTCTCTGTCCTAC TCC TAGCAAAGCGTAC TA ^ 
A S E , L F Q < • G G F F G G L G F C K DRSRORMR I V S H 0 



3spM I Xma III 

TGAACAAGATGGATTGCACGCAGGrTCrCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCAC AACAGACAA TCGGC TGCTC TGATGCCGCC 
ACTTGTTCTACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGAfAAGCCGATAC TGACCCGTGTTGTC TGTTAGCCGACGAGAC TACGGCGG ^ 
• T R V I A R ft F S G R L G G 6 A IRL .LGTTONRLL .CP 

Nar I 

I j Bbel f<spl 

GTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTC TTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAAC TGCAAGACGAGGCAGC GCGGCTA TCGT 

CACAAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAACAGTTCTGGCTGGACAGGCCACGGGACTTACTTGACGTTCTGCTCC 

R v R A V S A G A P G S F C Q Q R P V R C P E . T A R R G S A A I V 

Fsp I 

M I j p u II Tth I 

ggctggccacgacgggcgttccttgc'gcagctgtgctcgacgttgtcactgaagcgggaagggactggctgctattggg t 

CCGACCGG7GCTGCCCGCAAGGAACGCGTCGACACGAGCTGC AACAGTGACTTCGCCCTTCCCTGACCGACGATAACCCGC TTCACGGCCCCGTCC TAG A ^ 
A GH DGR SLRSCAR RCH . SGKG LA A I G R S AS A 6 3 

BspM I 

CC TG TC A TC TCACC T TGC TC C TG CCG AG AA AG TATCC AT C A TGGCTG A TGCAATGCGGCGGC TGCATACGCTTGATC CGGCTACC TGCCCATTCGACC AC 
GGACAGT AGAGTGGAACGAGGAC GGCTCTTTCATAGGTAGTACCGACTACGTTACGCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAGCTGGTG ^ 
° V 1 S P C S C R E, S I H H G . C N A A A A Y A S G Y U P I R P 

Ear I 

Ksp632l 

CAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAG 

gttcgctttgtagcgtagctcgctcgtgcatgagcctaccttcggccagaacagctagtcctactagacctgcttctcgIagtccccgagc 

P S E T S H R A S T Y S 0 G S R S C R S G SGRRA SGAR A S R 

pPh I pco I 

AACTGTTCGCCA GGCTCAAGGCGAGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCAT 

TTGACAA 3CGGTCCGAGTTCCGC ^CGTACGGGC TGCCGC TCC TAGAGCAGC AC TGGGTACCGCTACGGACG AACGGC TTATAGTACCACC TTTTACCGGC ^ 
TVR QAQ GEHARRR GSRRDPVR CL LAE Y H G 3 K V P 

| Nao 1 , Rs ' 11 Ksp632l 

- r7T ^^'*''-' <s> ^ CATCGAC TGTGGCCGGCTGGGTG TGGCGGACCGCTATC AGGACATAGCGTTGGC TACCCGTGATATTGCTGAAGAGC TTGGCGGC GAA 

gaaaagacctaagtagctgacaccggccgacccacaccgcctggcgatagtcctgtatcgcaaccgatgggcactataacgactIctcgaaccgccgctt ::7: " : 



F v I H R LVPAGCGGPL SGHSVG 



Y P Y C . R C v R p 



fig 29 pEGFPsac (1 > 5100) Site and Sequence 

TG3GCrGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTrCTArCGCCTTCTTGACGAG rTCTrCTGAGCGaGACTCT 
ACCCGACTGGCGAAGGAGCACGAAATGCCATAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAACTGCTCAAGAAGACTCGCCCT3AGA ^ 
M G • P L P R , A L R Y R R S R F A A H R L L S P S . RVLUSGTL 

^su II BspM | 

GGuGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCC ATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGC rTCCGAATCGTTTTC 

CCCCAAGCTTTACTGGCTGGTTCGCTGCGGGTTGGACGGrAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAS ^ 
G F E M T Q Q A T P N L P S R 0 F 0 S T A A F V E R L G F G I V F 

/Mae I Kspl Avf „ 

1 ' i 

CGGGACGCCGGCTGGArGArcCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCTAGGGGGAGGCTAACTGAAACACGGAAGGAGAC AATAC 
GCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGATCCCCCTCCGATTGACTTTG TGCCT7CCTCTGTTATG ^ 
RDA GWM1LOR G0L MLEFFAH PRG RLT E T R K £ T | 

J<spl 

CGGAAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAACGCACGGTGTTGGGTCGTTTGTTC ATAAACGCGGGGTTCGGTCCCAGGGCTGGC A 
GCCTTCCTTGGGCGCGATACTGCCGTTATTTTTCTGTCTTATTTTGCGTGCCACAACCCAGC AAACAAGTATTTGCGCCCCAAGCCAGGGrcCCGACCGT *"* 
p EGTRAM TAI KRONKTH GVG SFVHKRGVR S 0 G W H 

TCGATACC CCACCGAGACCCCATTGGGGCC AATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCT C 
GAGACAGCTATGGGGIGGCTCTGGGGTAACCCCGGTTATGCGGGCGCAAAGAAGGAAAAGGGGTGGGGTGGGGGGTTCAAGCCCACTTCCGGGrCCCGAG ^ 



SVOTPPRpHWGO 



Y A R V S S F SPPHPPSSGEGPGL 



jWwnl pxaNl pral pra| 

GCAGCCAACGTCGGG GCGGCAGGCCC7GCCATA3CCTCAGGTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATC TAGG 
CGTCGGTTGCAGCCCCGCCGTCCGGGACGGTATCGGAGTCCAATGAGTATATATGAAATCTAACTAAATTrTGAAGTAAAAATTAAATTTICCTAGATr,' ^ 
AAN VG AAGPAlASGYSYIL. lOL KLHF . F K R 1 

BspH I 

TGAAGATCCTTTTTGATAArCTCATGACCAA AATCCCTTAACGTGAGTTTTCGTTCCACrGAGCGTCAGACCCCGTAGAAAAGATCAAAGG ATCTT:TT^ 

AC " TC rAGGAAAAACTATTAGAGTACrGGTTTTAGGGAATTGCACTCAAAAGCAAGGTGACTCGCAGTclGGGGCATCTTTTCTAGTTTCCTAGAAGAA-' ^ 

- - ' L ' ° N L H T K 1 P ■ » 6 P. S F H , A S 0 P V E K , K G S S 

AGArcCTTTTTTTCTGCGCGTAArCTGCTGCT TGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTT TT 
ICTAGGAAAAAAAGACGCGCATTAGACGACGAACGTTTGTTTrTTTGGTGGCGATGGTCGCCACCAAACAAACGGCCTAGTTCTCGATGGrTGAGAAAAA ^ 
° " F F L R V ' C C L 0 T K K P PLPAVVCLPOQELPTLF 



Bsrl 

ccgaaggtaactggc'ttcagcagagcgcagat accaaatactgtccttctagtgtagccgtagttaggccaccacttcaagaa ctctgtagcaccg-.-ta 

gg:ttccattgaccgaagtcgtctcgcgtctatgg-ttatgacaggaagatcacatcggcatcaa T ccggtgg7gaagttcttga G acatcgtggcgg^ 
3 " V T G F s ft A 0 ' p v t U V . P LGHHF knsvapp 



fig 29 pEGFPsac (1 > 5100) Site and Sequence 

Alwn I 

catacctcgctctgctaatcctgttaccag^ 

GrATGGAGCGAGACGATTAGGACAATGGTCACCGACGACGGTCACCGCTATTCAGCACAGAATGGCCCAACCTGAGTTCTGCTATCAATGGCCTATrCCG 
TY LAL > l.tLP VAAAS GO KSC LTGLDS R R L P D K A 

GCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATA 

CGTCGCCAGCCCGACTTGCCCCCCAAGC ACGTGTGTCGGGTCGAACC TCGCTTGCTGGATGTGGCTTGACTCTATGGATGTCGCACTCGATACTCTTTCG ^ 
0 3 S G • T G G S C T Q P S L E R T T Y T E LRYLQREL.E3 

GCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGG GAGCTTCCAGGGGGAAACGCCTGGT 
CGGTGCGAAGGGCTTCCCTCTTTCCGCCTG ^CCATAGGCCATTCGCCGTCCCAGCCTTGTCCTCTCGCGTGCTCCCTCGAAGGTCCCCCTTTGCGGACCA ^ 
A TL P E GRK AO R YP VSG RVGT GER TR ELPGGNAV 

ATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGG 



| AAAAACGCCAG CAACGC 

TAGAAATATCAGGACAGCCCAAAGCGGTGGAGACTGAAC TCGCAGCTAAAAACACTACGAGCAGTCCCCCC6CCTCGGATACCTTTTTGCGGTCGTTGCG ^ 
Y L Y S P V G . F R H > ; t E R R F I . C S S G G R S L V K N A S N A 

Ava III 
fisi I 

ggcctttttacggttcctggccttttgctggccttttgctcacatgttctttcctgcgttatcccctgattctgtggata accgtattacc^ 

CCGGAAAAATGCCAAGGACCGGAAAACGACCGGAAAACGAGTGTACAAGAAAGGACGCAATAGGGGACTAAGACACCTATTGGCATAATGGCGGTACGTA 
A / L . R F L . A F C V P F A HMFFPALSPO SVONRITAMH 
fig 30 pEGFP72 (1 > 9697) Site and Sequence Page 1 



TAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATA TGGAGTTCCGCGTTACATAAC TT ACGGTAAAfGGCCCGCCTSGCTGACCG 
ATCAArAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAArGCCATTTACCGGGCGGACCGACTGG: ^ 



LLIVINYGVISS.PIYGVPRYIT 



YGKVPAVLT 



^ « Aat II 

CCCAACGACCCCCGCCCATTGACG^ 

gggttgctgggggcgggtaactgcagttattactgcatacaagggtatcattgcggttatccctgaaaggtaactgcagItacccac 200 

AQ RPPP l OVN NQVCSHS NANRD FPLT S M G G V F TV 

f9" [Wei Aat II Rgl I 

aaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcA^ 
t ™gacgggtgaaccgtcatgtagttcacatag^ 300 

N C P L G S T S S V S Y A K Y A P Y ( R Q . R . M A R L A L C P V 

jSnaB I Nco I 

CATGACCTTATGG GACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCrATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCG TGGm 

gtactggaataccctgaaaggatgaaccgtcatgtagatgcataatcagtagcgataatggtaccactacgccaaaaccgtcatgtagt ttCW 

H D L M G L S Y L A V H L R t S H R Y Y H G Q A V I A V/ H Q V A V 

|Aat II 

TAGCGGTTTGAC TCACGGGGATTTCCAAGTCTCCACCCCATTGACGTC AATGGGAGTTTGTTrTGGCACCAAAATCAACGGG ACTTTCCAAAATGTCGTh 

atcgccaaactgagtgcccctaaaggttcagaggtggggtaactgcagttaccctcaaac aaaaccgtggttttagttgccc tgaaaggttttacagcat 530 

1 A - V ; 1 T G 1 S K S P P " • ft 0 V EFVL APKSTGLSKMS 


Nhe I Ec47 

ACAACTCCGCCCC ATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATC 
TGTTGAGGCGGGGrAACTGCGTTTACCCGCCATCCGCACATGCCACCCTCCAGATATATTCGTCTCGACCAAA TCACTTGGCAGTCTAGGCGAfCGCGAT 
° L » P 1 ° A N G R • * C T V G G L Y K Q S W F S E P S 0 P L A l_ 



Nco I 

" TGGACGGCG ACGT AA ACG 3CCACAAG r TC AGC; 



CCGGTCGCCACCATGGTGAGCAAGGGCGAGGAGCTGTTC ACCGGGGTGGTGCCCATCCTGGTCGAGC 



GGCCAGCGGTGGTACCACTCGTTCCCGCTCCTCGACAAG TGGCCCC ACCACGGGTAGGACCAGC TCGACC TGCCGC TGCATTTGCCGGTGTTCAAG TCG~ ?& 



— — eGFP.C.e.unc53 

PVATMVSKGE 



E L F 7 G v v P t L V E L 0 G OVNGHKFS 
TGTCCGGCGAGGGC GAGGGCGATGCC ACCTACGGCAAGC |G^CCCTGAAGTTCATCTGCACC ACCGGCAAGCTGCCCGTGCCCTGGCCCACCCT CGTGAC 

acaggccgctcccgTtcccgctacggtggatgccgttcgactgggacttcaagtagacgIggtggccgtIcgacgggcacgggaccgggIg eC0 



" eGFP.C.e.unc53 

v s G I G E G oatygxltlkf ictt 



gklpvpvpflvt 
C ACCC TGACCTACGGCGTGCAGTGCTTCAGCCGC TACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCC ATGCCCGAAGGCTACuTCCAGGAc 
GrGGGACTGGATGCCGCACGTCACGAAGTCGGCGATGGGGCTGGTGTACTTCGTCGTGCTGAAGAAGTTCAGGCGGTACGGGCTTCCGATSCAGGTCC K " ^ 



-eGFP.C.e.unc53 



T L T V G V 0 C F S R Y P Q H M K Q H 0 F F K SAMP E GYVQE 



j<Spl 



CGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCG AGCTGAAGG GC ATCG 
GCGTGGTAGAAGAAGTrCCTGCTGCCGTTGATGTTCrGGGCGCGGC TCCACTTCAAGCTCCCGC TGTGGGACCACTTGGCGrAGCTCGACTTCCCGTAGC 



-eGFP.C.e.unc53 



R T IFFKDDGNYKTRAE VKFEGOT 



LVNRIEUKGI 



ACT 



TCAAGGAGGACGGCMCATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCG ACAAGCAGAAGAACG6CATCAA 
TGAAGTTCCTCCTGCCGTTGTAGGACCCCGTGTTCGACCTCATGTTGATGTTGTCGGTGTTGCAGATATAG TACCGGCTGTTCGTCTTCTTGCCGTAGTT 



~ " ~" — — — ^— — — eGFP.C.e.unc53 

0 F K E 0 G N I L G H K L E Y N Y N S H N V Y I M A 0 K 0 K N G I K 

GGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGG CCCCGTGCTGC 
CCACTTGAAGTTCTAGGCGGTGTTGrAGCTCCTGCCGTCGCACGTCGAGCGGCTGGTGATGGTCGTCTTGTGGGGGTAGCCGCTGCCGGGGCACGAC^ ,2C,; 



; "*"■ eGFP.C.e.unc53 - 

V N F K IRHN IEDGSVQLADH 



YQQNTP IGOGPVL 



CCCGACAACCACTACCTGAGCACCCAGTCCGCCC7GAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCC TGC TGGAGTTCGTGACCoCCG CCGGCiA 
GGGCTGTTGGTGArGGACTCGTGGGTCAGGCGGGACrCGTTTCTGGGGTTGCTCTTCGCGCTAGTGTACCAGGACGACCTCAAGCACTGGCGGCGGCrrT ^ 



-eGFP.C.e.unc53 



PDN HYLSTQS ALS KDPNEKROHMVLLEFVTA 



A G 



Asu II 

:coN I 



0spM II Bgl || 

TCACTCTCGGCATGGACGAGC ^^^CAAGTCCGGACTCAGATCTACGTCAAATGTAGAATTGATACCAATCTACACGGATTGGGCCAArCGGCACC TTTC 
AGTGAGAGCCGTACCTGCTCGACATGTrCAGGCC TGAGTCTAGATGCAGTTTACATC TTAAC TATGGTTAGATGTGCCTAACCCGGTTAGCCGTGGAjIA^ 



-eGFP.C.e.unc53 



ITLGMOELYKS 



- C.e. unc53 



G I » S T SNVEL IP 1 YTDVANRHL 



Nru I 



GAAGGGCAGCTT ATCAAAG tcgattacggatatttccaa tgattttcgcgactatcgactgg tttc tcagc ttattaatg tga tcgttccg a tcaacgaa 

C TTCCCGTCGAATAGT f TCAGCTAATCCCTATAAAGGTTACTAAAAGCGC TGATAGCTGACC AAAGAGTCGAA TAA TTAC AC TAGCAAGG" TAftTTGrTT 



EcoR I 

I 



■eGFP.C e.unc53 



— — — — — — -n a iin^l . 

5 3 U 5 K S I Bp | S « 0 F R Q y R L V SQLIHV|V»| ME 






Tuesday, 18 November 1997 10:34 

fig 30 pEGFP72 (1 > 9697) Site and Sequence Page 3 
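
Each map in these figures prints a one-letter translation beneath the nucleotide sequence, using the standard genetic code. The sketch below shows one way such a translation line can be computed; it is an illustration under stated assumptions, not the software used for the figure. The names `CODON_TABLE` and `translate` are hypothetical, and the test fragment is the start of the eGFP reading frame as printed on the first page of this figure.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering;
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string in the given reading frame (0, 1 or 2).

    Codons containing characters other than A, C, G, T (for example
    unreadable bases) are reported as 'X'.
    """
    dna = dna.upper().replace(" ", "")
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


if __name__ == "__main__":
    print(translate("ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTG"))
```

Reporting unreadable codons as `X` keeps the reading frame aligned, which is useful when a transcribed sequence contains damaged characters.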



Bsm I 

T rCTCGCCTCCATTCACGAAACGTTTGGCAAAAATCACA TCGAACC TGGATGGCCTCGAAACGT6TCTCGACTACCTGAAA AATC TGGGTC TCGACTGCT 
AAGAGCGGACGTAAGTGCTTTGCAAACCGTTTTTAGfGTAGCTTGG ACCTACCGGAGC TTTGCACAGAGC TGATGGACTTTTTAGACCCAG AGC TGACGA 



•C.e. unc53 



F S P A F X K R L A K ITSNLOGLETCLDYLKNLGLOC 



EcoR V 



Ear I 
Ksp632l 



Hind 111 

I 



CGAAACTCACCAAAACCGATATCGACAGCGGAAACTTGGGTGCAGTTCTCCAGCTGCTCTTCCTGCTCTCCACCTACAAGCAGAAGCTTCGGCAACTGAA 



GCTTTGAGTGGTTTTGGCTATAGCTGTCGCCT 



TTGAACCCACGTCAAGAGGTCGAC6AGAAGGACGAGAGGTGGATGTTCGTCTTCGAAGCCGTTGAC 



TT 



I70C 



-C.e. unc53 



S K L T K T 0 1 D S G N L GAVLQLLFLL5TYKQKLRQLK 



3 maCI 
PmaCI 



rmaCl 
Pn 

AAAAGATCAGAAGAAATTGGAGCAACTACCCACATCCATTATGCCACCCGCGGTTTCTAAATTACCCTCGCCACGTGTCGCCACGTCAGCAACCGCTTCA 



TTTTCTAGTCTTCTTTAACCTCGTTGaTGGGTGTAGGTAATACGGTGGGCGCCAAAGATTTAATGGGAGCGGTGCACAGCGGTGCAGTCGTTGGCGAAGT 



*- race 



-eGFP.C.e.unc53 



•C.e. uncS3 



KOQKKLEQLPTS 



MPPAVSKLPSPRVATSATA 



gcaactaacccaaattccaactttccacaaatgtcaacatccaggcttcagactccacagtcaagaatatcgaaaattgat tcatcaaagattggtatca 
cgttgattgggtttaaggttgaaaggtgtttacagttgtaggtccgaagtctgaggtgtcagttcttatagcttttaactaagtagtttctaacc^ 



iso: 



-eGFP.C.e.unc53 



~ — -C.e. unc53 — — — 

atnpnsnfpqm stsrlotpqsriskiosskigi 



Aat H 



AGCCAAAGACGT CTGGACT |"AAACCACCCTCATC ATC AACCACTTCATCAAATAATAC AAAT rCATTCCGTCCGTCGAGCCGTTCGAGTGoC AATAATAA 

tcggtttctgcagacctgaatttggtgggagtagtagttggtgaagtagtttattatgtttaagtaaggcaggcagctcggcaagctcaccgttattatt 22sX 



' C.e. unc53 — ■ — - 

K P * T S G L K P P J S S T TSSNNTNSFRPSSRSS6NNN 



fig 30 pEGFP72 (1 > 9697) Site and Sequence 

Ear I 

jEcoR V Ksp632l Asu II 

tgttggctcgacgatatccacatctgcgaagagcttagaatcatcatcaacgtacagctctatttcgaatctaaaccgacctacctcccaactccaaaaa 

ACAACCGAGCTGCTATAGGTGTAGACGCTTCTCGAATCTrAGTAGTAGTTGCATGTCGAGATAAAGCTTAGATTTGGCTGGATGGAGGGTrGAGGTrTTT 



-eGFP.C.e.unc53 



-C.e. uncS3 



VGSTISTSAKSLESSSTVSSISNLNRPTSQLQK 

^ba I pe I 

CCTTC TAGACCACAAACCCAGCTAGTTCGTGTTGC TACAACTACAA AAATCGGAAGCTCAAAGCTAGCCGC TCCGAAAGCCGTGAGC ACCCC AAAACTTG ' 
GGAAGATCTGGTGTTTGGGTCGATCAAGCACAACGATGTTGArGTTTTTAGCCTTCGAGTTTCGATCGGCGAGGCTTTCGGCACTCGTGGoGTTTTGAAC 



-eGFP.C.e.unc53 



• C.e. uncS3 



PSR PQTQL VR VAT TT K IGS S KLAAPKAVS T P K L 

Bsm I 

i 

CTTCTG TGAAGACTATTGGAGCAAAACAAGAGCCCGATAACAGCGGTGGTGGTGGTGGTGGAATGCTGAAATTAAAGTTATTCAG TAGC AAAAACCCATC 

GAAGACACTTCTGATAACCTCGTTTTGTTCTCGGGCTATTGTCGCC ACCACCACCACCACCTTACGACTTTAATTTCAATAAGTC ATCGTTTTTGGGTAG 



-eGFP.C e unc53 



-C.e. unc53 



A S VKTIG AKQ EPQN S G G G GGGMLKL K L F S S K N P 3 

TTCCTCATCGAATAGCCCACAACCTACGAGAAAGGCGGCGGCGGTGCCTCAACAACAAACTTTGT CGAAAATCGCTGCCCCAGTGAAAAGTGGCCTGAAG 
AAGGAGT AGCTTATCGGGTGTTGGATGC TCTTTCCGCCGCCGCCACGG AGTTGTTGTTTGAAACAGCTTTTAGCGACGGGGTCAC TTTTCACCGGACTTC 



- eGFP.C. e.unc53 



-Co. uncS3 



5 S S H S P Q P T R K A A A V PQOOTLSKlAAPVKSGLt 

BstX I Hind ill 

I I 
CCGCCGACCAGTAAGC TGGGAAGTGCCACGTC TATGTCGAAGCTTTGT ACGCCAAAAGTTT CCTACCGT AAAACGGACGCCCC AA TCATAT^ TC AACAAq 

GGCGGCTGGTCATTCGACCCTTCACGGTGCAGATACAGCTTCGAAACATGCGGTTTTCAAAGGATGGCATTTTGCCTGCGGGGrTAGTATAGAGTTGTTC 



-eGFP.C.e.unc53 



— C.e. uncS3 

0?T SKLGSATSMSKLCTPKVSYRKTDA3| ISQO 



fig 30 pEGFP72 (1 > 9697) Site and Sequence 

Ear I 

j<sp632l : BspM II 

ACTCGAAACGATGCTCAAAGAGCAGTGAAGAAGAGTCCGGATACGCTGGATTCAACAGCACGTCGCCAACGTCATCATCGACGGAAGGTTCCCTAAGCAT 


TGAGCTTTGCTACGAGTTTCTCGTCACTTCTTCTCAGGCCTArGCGACCTAAGTTGTCGTGCAGCGGTTGCAGTAGTAGCTGCCTTCCAAGGGATTCGTA 



eGFP.C.e.unc53 



C.e. unc53 — 

OSKRCSKSSEEESGYAGFNSTSPTSSSTEGSLSP 



Bsm I 
Sph I 
i Ava ill 

I r ' 

GCATTCC ACATCTTCCAAGAGTTCAACGTCAGACGAAAAGTCTCCGTCATCAGACGATCTTACTCTTAACGCCTCCATCGTGACAGCTATC AGACAGCCG 

CGTAAGGTGTAGAAGGTTC TCAAGTTGCAGTCTGCTTTTCAGAGGCAGTAGTCTGCTAGAATGAGAATTGCGGAGGTAGCAC TGTCGATAGTCTGTCGGC 



eGPP.C.e.unc53 



C.e. unc53 — 

HSTSSKSSTSDEKSPSSOOLTLNAS 1VTAIR0P 



psp I 

ATAGCCGCAACACCGGTTTCTCCAAATATTATCAACAAGCCTGTTGAGGAAAAACCAACACTGGCAGTGAAA GGAGTGAAAAGCACAGCGAAAAAAGATC 
TATCGGCGTTGTGGCCAAAGAGGTTTATAATAGTTGTTCGGACAACTCCTTTTTGGTTGTGACCGTCACTTTCCTCACTTTTCGTGTCGCTTTTTTCTAG 



eGFP.C.e.unc53 



~~ — C.e. unc53 

tAATPVSPNI INKPVEEKPTLAVKGVKSTAKKO 



PmaCI 

Pvu II | PmaCI EcoR V 

1 if i 

CACCTCCAGC TGTTCCGCCACGTGACACCCAGCCAACAATCGGAGTTGTTAGTCCAATTATGGCACATAAGAAGTTGACAAATGACCCCGrGATATCTGA 


GTGGAGGTCGACAAGGCGGTGCACTGTGGGTCGGTTGTT AGCCTCAACAATCAGGTTAATaCCGTGTATTC ttcaactgtttactggggcactatagact 



-eGFP.C.e.uncS3 



-C.e. unc53 



PPPAVPPROTOPT IGVVSP imahkkl TNOPV I SE 



Alwn I 

I 

aaaaccagaac'ctga aaagctccaatcaatgagcatcgacacgacggacgttccaccgcttccacctctaaaatcagttgttccacttaaaatgacttca 
ttttggtcttggacttttcgaggttagttactcgtagctgtgctgcctgcaaggtggcgaaggtggagattttagtcaacaaggtgaattttactgaagt 



eGFP.C.e.unc53 



■ — C.e. unc53 ■ ■ • 

k? epeklqsmsidttovpplpplksvvplv:mt3 
fig 30 pEGFP72 (1 > 9697) Site and Sequence 

Spli 

I f P " 

ArcCGACAACCACCAACGTACGATGTTCTTCTAAAACAAGGAAAAATCACATCGC CTGTCAAGTCGTTTGGATATGAGCAGTCGrCCGCGTCTGAAGACT 
TAGGCTGTTGGTGGTTGCATGCTACAAGAAGATTTTGTTCCTTTTTAGTGTAGCGGACAG TTCAGC AAACC TA TACTCGTC AGC AGGCGCAGAC TTCTGA ~ ' 



-eGFP.C.e.ur.c53 



-C.e. unc53 



! R Q P P T V D V I L K Q G K I T S P V K S F G Y E Q S S A S E D 

CCATTGTGGCTCATGCGTCGGCTCAGGTGACTCCGCCGACAAAAACTrCTGGTAATCATTCGCTGG AGAGAAGGATGGGAAAGAATAAGACATCAGAATC 
GGTAACACCGAG TACGCAGCCGAGTCCACTGAGGCGGCTGTTTTTGAAGACCATTAG TAAGCGACCTCTCTTCCTACCCTTTCTTATTC TG T AG TC TT AG 



•eGFP.C.e.unc53 



-C.e. unc53 



S I V A H A S A Q V T P P MTSGHHSLE RRMGKNKTSE3 

CAGCGGCTACACCTCTGACGCCGGTGTTGCGATGTGCGCCAAAATGAGGGAGAAGCTGAAAG AATACGATGACATGACTCGTCGAGCACAGAACGGCTAT 
GTCGCCGATGTGGAGACTGCGGCCACAACGCTACACGCGGTTTTAC TCCCTCTTCGACTTTC TTATGCTAC TG TACTGAGCAGCTCGTGTC TTGCCGATA ^ 



-C.e. unc53 



S G Y T s 0 A G V A M CAKMREKLKEYDDMTRRAQNGY 



CCTGACAACTTCGAAGACAGTTCCTCCTTGTCGTC TGGAATATCCGATAACAACGAGCTCGACGACATA TCCACGGACGATTTGTCCGGAGTAGAC ATGu 
GGACTGTTGAAGCTTCTGTCAAGGAGGAACAGCAGACCTrATAGGCTATTGTTGCTCGAGCTGCTGTATAGGTGCCTGCTAAACAGGCCTCATCTGTACC 



-C.e. unc53 



P 0 N F E 0 S S S L S S G I S 0 M N E L D 0 I STDDLSGVDM 

CAACAGTCGCCTCCAAACATAGCGAC tattcccactttg ttcgccatccc acgtcttcttcc TCAAAGCC CCGAGTCCCCAG TCGGTCC tcc ac atcag* 
g:tgtcagcggaggtttgtatcgctgataagggtgaaacaa g cggtagggtgcagaa gaaggagtttcggggc tcaggggtcagccaggaggtgtagtca 

-e GFP.C.e.ur,c5 3 

' — C.e. unc53 ., ■■ 

A T V A s K hsoyshfvrhptsssskprvpsrss TS v 






Tuesday, 18 November 1997 10:34 Page 7 
fig 30 pEGFP72 (1 > 9697) Site and Sequence 

Bgl I 
j Nar I 

^ho I ! j pbe I 

CG ATTCT CGATCTCGAGCAGAACAGGAGAATGTGTACAAACTTCTGTCCCAGTGCCGAACGAGCCAACGTGGCGCCGC TGCCACC TCAACC TTCGGACAA 


GCTAAGAGCTAGAGCTCGTCTTG TCCTCTTACACATG TTTGAAGAC AGGGTCACGGC TTGCTCGGT TGCACCGCGGCGACGG TGGAGTTGG AAGCC TGT~ 



-eGFP.C.e.unc53 



-C.e. unc53 



O SR SRA EQ ENVYK LL S QC R TSQRGAAAT5TFGQ 

Xma I 5p 0 | 

| jSma I p vu It j Sal I 

CATTCGC TAAGATCCCCGGG ATACTCATCCTATTCTCCACAC rTATCAGTGTCAGCTGATAAGGACACAATGTCTATGCACTCAC AGACTAG TCGACGAC 

"" ' ' ' ' ' ' * 1 ' ..... ■>■■■■ 4 ■ i . 1 . i t ■ i ■ j t ■ , j J j i i | i n i i | 

GTAAGCGATTCTAGGGGCCCTATGAGTAGGATAAGAGGTGTGAATAGTCACAGTCGACTATTCCTGTGTTACAGATACGTGAGTGTCTGArCAGCTGCTG 



-eGFP.C.e.unc53 



- C.e. uncS3 



H SLRSPGYS5Y5PHLSV5ADK0THSHHS0TSRR 



CTTCTTCACAAAAACCAAGCTATTCA GGCCAATTTCATTCACTTGATCGTAAATGCCACCTTCAAGAGTTCACATCCACCGAGCACAGAArGGCGGCTCT 

GAAGAAGTGTTTTTGGTTCGATAAGTCCGGTTAAAGTAAGTGAACTAGCATTTACGGTGGAAGTTCTCAAGTGrAGGTGGCTCGTGTCTTACCGCCGAGA 



-eGFP.C.e.unc53 



-C.e. unc53 



P S S Q K P S Y S G Q F H S L 0 R K C H L Q E FTSTEHRMAAL 

jBam HI 

CTTGAGCCCGAGACGGGTGCCGAACTCGATGTCGAAATATGATTCTTCAGGATCCTACTCGGCGCGrTCCCGAG GTGGAAGCTCTACTGGTATCTATGGA 
GAACTCGGGCTCTGCCCACGGCTTGAGCTACAGCTTTATACTAAGAAGTCCTAGGATGAGCCGCGCAAGGGCTCCACCTTCGAGATGACCATAGATACC ^ 



-C.e. unc53 



LSPRRVPNSMSKYDSSGSYSARSRGGSST 



G I Y G 



0am HI Nhe I Nde I 

GAGACGT TCCAACTGCACAGAC TATCCGATGAAAAA rcCCCCGCACAf TC TGCCAAAAGTGAGATGGGA TCCC AAC TATCAC TGGCTAGCAwGACAGC A" 
CrCTGCAAGGTTGACGTGTCTGATAGGCTACTTTTTAGGGGGCGTGTAAGACGGTTTTCACTCTACCCTAGGGrTGArAGrGACCGATCGTGCTGTCGTA 



-eGFP.C.e.unc53 



"~ — -C.e. unc53 ■ 

I T F OLHRLSDEKSPAHSAKScMGSOLSLAS 7TA 
PmaCI 
I PmaCI 



Sal I 



atggatctctcaatgagaagtacgaacatgctattcgggacatggcacgtgacttggagtgttac aagaacactgtcgactcactaaccaagaaacaggh 

TACCTAGAGAGTTACTCTTCATGCTTGTACGATAAGCCCTGTACCGTGCACTGAACCTCACAATGTTCTTGTGACAGCTGAGTGATTGOTTCTTTGrCC" 



IX 



-C.e. unc53 



YGSLNEKYEHA! ROMAROLEC YKNTVDS 



L T K K Q E 



Ear I 

Hind III Cla I Ksp632I

GAACTATGGAGCArTGTTTGATCTTTTTGAGCAAAAGCTTAGAAAACrCACTCAACACATTGATCGATCC AACTTGAAGCCTGAAGAGGCAATACGATT: 
CTTGArACCTCGTAACAAACTAGAAAAACTCGTTTTCGAATC TTTTGAGTGAGTTGTGTAAC TAGCTAGGTTGAACTTCGGACTTCTCCGTTATGC TAAG 



-eGFP.C.e.unc53 



-C.e. unc53 



N Y G ALFOIFEQKLRKLTQHIORSNLKPEEAIRF 

aggcaggacattgctcatttgagggatattagcaatcatcttgcatccaactcagctcatgctaacgaaggcgctggtgagcttcttcgtcaaccatct: 



TCCGTCCTGTAACGAGTAAACTCCCTATAATCGTTAGTAGAACGTAGGTTGAGTCGA GTACGATTGCTTCCGCGACCACTCGAAGAAGCAGTTGGTAGA3 

-eGFP.C.e.unc53 



-C.e. unc53



qQD I A H L R Q t s NHLASNSAHANEGAGELLRQP3 



Cla I Cla I



Sst I 



Ear I
Ksp632I



TGGAATC AGTTGCATCCCATCGATCATCGATGTC^ 
ACCTTAGTCAACGTAGGGTAGCTAGTAGCTACAGTAGCAGCAGCTTTTCGTCGTTCGTCCTCTTCTAGTCGAACrCGAGCAAACCGTTCTTGTrCTTCT 



-C.e. unc53 



LE SVASH RSS MSSSSKS SKQ EK I SLSSFGKN K K 3 

Bam HI Nde I BspM II

CrGGArcCGCTCCTCACTCTCCAAGTTCACCAAGAAGAAGAACAAGAACTACGACGAAGCACATArGCCATC AATTTCCGGATCTCAAGGAACTCTTGA: 
GACCTAGGCGAGGAGTGAGAGGTTCAAGTGGTTC7TCTTCTTGTTCTTGATGCTGCT TCGTGTATACGGTAGTTAAAGGCCTAGAGTTCCTTGAGAACTL: 

-eGFP.C.e.unc53




IRSSLSKFTKK 



-C.e. unc53

KNKNYDEAHMPSI3GSQ3TLD 



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Sst I ApaL I 

AAC ATTGATGTGATTGAGTTGAAGCAAGAGCTCAAAG AACGCGATAGTGCACTTTACGAAGTCCGCC TTGACAATCTGGATCG 7GCC CGC5 AAGTTGAT3 
TTGTAACTACACTAACrCAACTTCGTTCTCGAGTTTCTTGCGCTATCACGTGAAATGCTTCAGGCGGAACTGTTAGACCTAGCACGGGCGCTTCAACTA: ^ 



-eGFP.C.e.unc53 



-C.e. unc53 



N 1 D V 



E L * Q E. I K E R D S A L Y E V RLDNLDRAREVD 

TTC 1"GAGGGAGACAGTGAACAAGTTGAAAACCGAGAACAAGC AATTAAAGAAAGAAGTGGACAA ACTCACC AACGGTCCAGCCAC TCGTGCTTCTTCCC2 
AAGACTCCCTCTGTCACTTGTTCAACTTTTGGCTCTTGTTCGTTAATTTCTTTCTTC ACCTGTTTGAGTGGTTGCCAGGrCGGTGAGCAC3AAGAAGGG= ^ 

-eGFP.C.e.unc53 



-C.e. unc53 



V L R £ T V N K L K T E N K Q L K K E V D K L T N G P A T R A S S P 

F" PI f SM ASU II 

CGCC TCAATTCCAGTTATC TAC6ACGATGAGCATGTCTA TGATGCAGCGTGTAGCAGTACATCAGCTAG TCAA TCTTCGA AACGATCCTCTGGC TGCAAC 
GCGGAGTTAAGG TCAATAGATGC TGCTACTCGTACAGATACTACGTCGCACATCGTCATG TAGTCGATCAGTTAGAAGCTTrGCTAGGAGACCGACGTTG ^ 



-C.e. unc53 



A , S P V 1 , Y D 0 I H V V 0 A A C S5TSASQ5SKRSSGCN 



Pvu I

Hpa I EcoR V

TCAATCAAGGTTACTGTAAACGTGGACATCGCTGGAG AAATCAGT 

AGTTAGTTCCAATGACATTTGCACCTGTAGCGACCTCTTTAGTC AAGCTAGCAATTGGGCCTGTTTCTCTATTAGCATCC TATAGAACGGTACAGTTG'j~ 



-eGFP.C.e.unc53 



-C.e. unc53 



' K , V T V W VDIAGEISSIVNPDKEI1V 



G Y L A M S T 



Cla I 



G r CAGTCATGCTGGAAAGACATTGATGTrTCTATTCrA3GACTATTTGAAGTCTACCTATCCAGAATTGA 

CAgTCAGTAC GACCTTTCTGTAACTACAAAGATAAGATCCTGATAAACTTCAGATGG ATAGGTCTTAACTACACCTCGTAGTTGAACCTTASCTACGAGC 

-eGFP.C.e.unc53



" — C.e. unc53 

SQSCWKQ i Dv/ S | LG LFEVYLSR IDVEHQLGIDAP 
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Mlu I 


tgattcratccttggcratcaaattggrgaacttcgacgcgtcattggagactccacaa ccatgataaccagccatccaactgacattcttacttcctca 
actaaga taggaaccgatagtttaaccacttgaagctgcgcagtaacc tctgaggtgttggtactattggtcggtaggttgac tgtaagaatgaacgag* 



-eGFP.C.e.unc53



-C.e. unc53 



DSILGYQ IGELRRV I GOSTTM I TSHPTO IL T 



S 3 



actacaa tccgaatgttcatgcacggtgccgcacagag tcgcgtagacag tctggtccttgatatgcttcttccaaagcaaatgattctcc aactcgtca 

TGATGTTAGGCTTACAAGTACGTGCCACGGCGTGTCTCAGCGCATCTGTCAGACCAGGAACTATACGAAGAAGGTTTCGTTTACTAAGAGGTTGAGCAGr * 2C * 



-eGFP.C.e.unc53 



-C.e. unc53 



T T j * M F M HGAAQSRVOSLVLOMLLPKQMILQLV 



Aat II Bsr I Bsr I Asu II

AGTCAATTTTGACAGAGAGACGTCTGGTGTTAGCTGGAGCAACTGGAATTGGAAAGAGCAA ACTGGCGAAGACCCTGGCTGC TTATGTATCTATTCGAAC 
TCAGTTAAAACTGTCTCTCTGCAGACCACAATCGACCTCGTTGACCTTAACCTTTCTCGTTTGACCGCTTCTGGGACCGACGAATACATAGATAAGCTT^ MU 



-C.e. unc53 



K S ILTERRLVLAGATG IGKSKLAKTL 



A A Y V S ( R T 



Bsm I Xmn I Bgl II

AAATCAATCCGAAGATAGTArTGTTAArATCAGC ATTCCTGAAAAC AATAAAGAAGAATTGC TTCAAG TGGAACGACGCCTGGAAAAGATCTTGAG AAG: 
TTTAGTTAGGCTrCTATCATAACAATTATAGrCGTAAGGACTTTTGTTATTTCTTCTTAACGAAGTTCACCTTGCTGCGGACCTTTTCTAGAACTCrTCG ^"^ 



-C.e. unc53 



N Q S . E 0 S J V N I S I P E N N K E E L L Q V £ R RLEKILP3 
Ava III 

Nsi I Xba I

AAAGAATC A TGC ATCG TAA TTCTAGATAATATCCC AAAGAATCGAATTGCATTTGTTG TATC CGTTTTTGC AAATGTCCC AC TTC AAAAC AACGAAGGTC 
TTTC TTAGTACG TAGC ATTAAGATCTAfTA TAGGG TTTC TTA GCTTAACGTAAACAA CATAGGCAAAAACG TTTACAGGG TGAAG TTTTGTTGC TTCC A3 

-eGFP.C.e.unc53



-C.e. unc53

* E S , C 1 V j L 0 M I P K N RIAFVVSVFANVPLONNEG 
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EcoR V 



CATTTGTAGTATGCACAGTCAACCGATATCAAATCCCTGAGC TTCAAATTCACCACAATTTCAA AATGTCAGTAATGTCGAArCGTCTCGAAGGATTCA~ 
GTAAACArCATACGrGTCAGTTGGCTATAGTTTAGGGACTCGAAGTTTAAGTGGTGTTAAAGTTTTACAGTCATTACAGCTTAGCAGAGCTTCCTAAGTA 



-eGFP.C.e.unc53 



-C.e. unc53 



P F V V C T V N R Y Q I P E L Q IHHNF KMSVMSNRLEGF [ 



Ear I 



Sst I Ksp632I

CCTACGTTACCTCCGACGACGGGCGGTAGAGGATGAG TATCGTCTAAC TGTACAGATGCCATCAGAGCTCT TC AAAATCATTGAC TTCTTCCCAATAGCT' 
GGATGCAATGGAGGCTGCTGCCCGCCATCTCCTACTCATAGCAGATTGACATGTCTACGGTAGTCTCGAGAAGTTTTAGTAACTGAAGAAGGGTTATCGA 



-eGFP.C.e.unc53 



-C.e. unc53 



U R Y L R R R A V E 0 E Y R L T V Q MPSELFKIlOFFPlA 



Ear I 
Ksp632I



Eco 



Sph I 



Bam HI 



CTTCAGGCCGTCAATAATTTTATTGAGAAAACGAATTCTGTTGATG TGACAGTTGGTCCAAGAGCAT GCTTGAAC TGTCC TC TAACTGTCG4TGGA TCCT 
GAAGTCCGGCAGTTATTAAAATAACTCTTTTGCTTAAGACAACTACACTGTCAACCAGGTTCTCGTACGAACTTGACAGGAGATTGACAGCTACCTAGGC 




LQA , VNNF 1EK THS VD VTV GPRACLNCPLTVOGS 

GTGAATGGTTCATTCGATTGTGGAATGAGAACTTCArTCCATATTTGGAACGTGTTGCTAGAGATGGCAAAAA AACCTTCGGTCGCTGCACTTCCTTCGA 
CACTTACCAAGTAAGCTAACACCTTACTCTTGAAGTAAGGTATAAACCTTGCACAACGATCTCTACCGTTTTTTTGGAAGCCAGCGACGTGAAGGAAGCT ^ 



-eGFP.C.e.unc53 



-C.e. unc53 



RE VF1RL , VNE NF ^Py UERVAROGKK TFGRCTSFE 



Bam HI



Tth I 



Tth I 




-C.e. unc53 



0 P T 0 1 v J * * W P V F 0 G ENPENVLKRLQLOOLVP 
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BspM I ^<ho I Sph I 

CCTGCCAACTCATCCCGACAACACTTCAATCCCCTCGAG TCGTTGATCCAATTGCATGCTACCAAGCATCAGACCATCGACAACATTTGAACAGAAGACT 


GGaCGGTTGAGTAGGGCTGTTGTGAAGTTAGGGGAGCTCAGCAaCTAGGTTAACGTACGATGGTTCGTAGTCTGGTAGCTGTTGTAAACTTGTCTTCTGh 



-eGFP.C.e.unc53



-C.e. unc53

p A N S S R Q H F N P L £ S L I QLH A T K H Q T I D N I . T E P 

Asp 718 
Kpn I

ctaatcttctctcgcctctcccccgctttccttatcttcgtaccggtacctgatgattccccattt tcccccttttccccccaatttcccagaacct^ 

GATTAGAAGAGAGCGGAGAGGGGGCGAAAGGAATAGAAGCATGGCCATGGACTACTAAGGGGTAAAAGGGGGAAAAGGGGGGTTAAAGGGTCTrGGAGGA 
S N L L S P L P R F P Y L R T G T . F P I F P LFPP ISQNLL 

Xma I 

Sma I Dra I Xmn I

GTTCCCTTTGTTCCTAGTCCTCCCGGGTGCCGACGCCGAAGCGATTTAAAAACCTTTTTCTTTCCGA AACATTTCCCATTGCTCATTAATAGTCAAATTG 
CAAGGGAAACAAGGATCAGGAGGGCCCACGGCTGCGGCTTCGCTAAAfTTTTGGAAAAAGAAAGGCTTTGTAAAGGGTAACGAGTAATTATCAGTTTAAC 
F p I F L V L P G A D A E A I K P FSFRN ISHCSLI VKL 

Apa I 
Sma I 

Bam HI Xba I Bcl I

aataaacagtgtatgtacttaaaaaaaaaaaaaaaaaaactcgagggggggcc'cgggatccaccggatc tagataactgatcataatcagccataccaca 

TTATTTGrCACATACATGAATTTTrTTTTTTTTTTTTrTGAGCTCCCCCCCGGGCCCTAGGTGGCCTAGArCTATTGACTAGTATTAGTCGGTATGGTGT 
N K Q C " Y L KKKKK .<LEGGPGIHR| . ITDHNQPYH



Xho I



Dra I Bsm I Hpa I

TTTGTAGAGGTTTTAC TTGCTTTAAAAAACCTCCCAC ACCTCCCCCTG AACCTGAAAC ATAAAATGAATGC AATTGTTG TTGTTAAC TTGTTTATTGCA3 
AAACATCTCCAAAATGAACGAAATTTTTTGGAGGGTGTGGAGGGGGACTTGGACTTTGTATTTTACTTACGTTAACAACAACAATTGAACAAArAACGT: 
1 C R G F T C . f < < ,P P T P P p E P £ T N E C NCCC . L V Y C 3 

Bsm I 

C TT AT AATGG TTACAA ATA AAGCAATAGCATC AC AAA TT TCACAAATAAAGCATTTTTTTCACTGC A TTCT AG TTGTGGTTTGTCCAAACTC ATCAATG7 



GAATArTACCAATGTTTATTTCGTTATCGTAGTGTTTAAAGTGTTTATTTCGTAAAAAAAGTGACGTAAGATCAACACCAAACAGGTTTGAGTAGTTACA 
L • v L 0 ' * Q ■ H H K F H K , S I F FTAF . LWFVQTHQC 

Mlu I p S p I

ATCTTAACGCGTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCrCATTTTTTAA CCAATAGGCCGAAATC^^^ 
rAGAATTGCGCATTTAACATTCGCAATTATAAAACAATTTTAAGCGCAATTTAAAAACAATTTAGTCGAGTAAAAAATTGGTTATCCGGCTTTAGCCGTT °' 
1 L T * * I ; A LIFC N S R IFVKSAHFLTNRPKSA 
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Bsrt 


AArCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTT GAGTGTTGTTCCAGTTTGGAACAAGAGTCCAC TATTAAAGAACGTGGACTCCAACGTCAAA 


TTAGGGAATATTTAGTTTTCTTATCTGGCTCTATCCCAACTCACAACAAGGTCAAACCTTGTTCTCAGGTGATAATTTCTTGCACCTGAGGTTGCAGTTT 

KSL I N 0 K N R P R G VLFQFGTRVHY . R T V T p t s » 

Dra III

GGGCG AAAAACCGTCTATC AGGGCGATGGCCCACTACGTG AACCATCACCCTAATCAAGTTTTTTGGGGTCGAGGTGCCG TAAAGCACTAAATCGGAACC 

CCCGCTTTTTGGCAGATAGrCCCGCTACCGGGTGATGCACTTGGTAGTGGGATTAGTTCAAAAAACCCCAGCTCCACGGCATTTCGTGATTTAGCCTTGG 

G£<PS IRAMAHYVNHHPNOVFVGRGAVKh' . IGT 

Nae I 

CTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGGC 


GATTTCCCTCGGGGGCTAAATCTCGAACTGCCCCTTTCGGCCGCTTGCACCGCTCTTTCCTTCCCTTCTTTCGCTTTCCTCGCCCGCGATCCCGCGACCG 

L K G A P 0 L E L 0 G E S R R T V R E R KGRKRKERALGRV 

J<spl j<spl 
AAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCGCTTAATG CGCCGCTACAGGGCGCGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGA 
TTC ACATCGCCAGTGCGACGCGCATTGGTGGTGTGGGCGGCGCGAATTACGCGGCGATGTCCCGCGCAG TCCACCGTGAAAAGCCCCTTTACACGCGCCT ' 
QV - RS RC A p p H P P RLMRRYRAROVALFGEMCAE 

Ear I 

BspH I Ssp I Ksp632I

ACCCCTATTTGTTTATTTTTC TAAATACATTCAAATATG TATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATAT TGAAAAAGGAAGAGTC 
TGGGGATAAACAAATAAAAAGATTTATGTAAGTTTATACATAGGCGAGTACTCTGTTATTGGGACTATTTACGAAGTTATTATAACTTTTTCCTTCTCAo 
PLFVYFS KY10IC IRS .DNNPDKCFNN I EKGRV 



OxaN I Pvu II 



Sph I 
Ava III 
Nsi I 


CTGAGGCGGAAAGAACCAGCTGTGGAArGTGTGTCAGTTAGGGTG 

GACTCCGCCTTTCTTGGTCGACACCTTACACACAGTCAATCCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTTCGTACGTAGAGTTA 
L R R K E P A V E C V $ V RVVKVPRLPSRQK YAKHASQ 

Sph I
Ava III

TAGTC AGCAACCAGGTG tggaaagtccccaggctccccagcaggcagaagtatgcaaagc atgcatctc aat tagtcagc aaccatagtcccgcccct AA 

ATCAGTCGTTGGTCCACACCTTTCAGGGGTCCGAGGGGTCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAATCAGTCGTTGGTATCAGGGCGGGGATT 
L 7 S N Q V v * V P R I P S R Q K Y A K HASOLVSNHSPAPrJ 

Bgl I

Bsr I Nco I Sfi I

C ^CGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCC ATTCTCCGCCCC ATGGCTGACTAATTTTTTTTA TTTATGCAGAGGCCGAGGCCGCCTCGGC 
GAGGCGGGTAGGGCGGGGATTGAGGCGGGTCAAGGCGGGTAAGAGGCGGGGTACCGACTGATTAAAAAAAATAAATACGTCTCCGGCTCCGGCGGAGCCG 
5 A H P A P N SAQFRPFSAPVLTMFFYLCRGRGRLG 
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Stu I 

Avr II Cla I

CTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAGATCGATCAAGAGACAGGATGAGGATCGTTTCGCArGATTGA 



GAGACTCGATAAGGTCTTCATCACrCCTCCGAAAAAACCTCCGGATCCGAAAACGTTTCTAGCTAGTTCTCTGrCCTACTCCTAGCAAAGCGTACTA^CT ^ 
I ■ AI P SVVRR LFV RP RLLQRS IKRQDEDRFA . t 

BspM I Xma III

ACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTT GGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTG 


TGTTC TACCTAACGTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTG TC TGTTAGCCGACGAGACTACGGCGGCAC 
M K M D C T Q V L R P L G V R G Y S A M T G HNRQSAALMPPC 

Nar I 

Bbe I Kspl
TTCCGGC TGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAAC TGCAAGACGAGGCAGCGCGGCTATCGTGGC 
AAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAACAGTTCTGGCTGGACAGGCCACGGGACTTACTTGACGTTCTGCTCCGTCGCGCCGATAGCACCG 
S G c , Q ft RGARFFLSRPTCPVP MNCKTRQRGYRG 



Fsp I



Bal I Pvu II Tth I

TGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCGGGAAGGG ACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATC TCCT 
ACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAACAGTGACTTC6CCCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGA *** 
tf P R R A F V A 0 L C S T L SL K R E G T G C YVAKCRGR IS 

BspM I

GTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATC ATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTA CCTGCCCATTCGACCACCAA 
CA3TAGAGTGGAACGAGGACGGCTCTTTCATAGGTAGTACCGACTACGTTACGCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAGCTGGTGGTT 
CH , LTLLL PRK YPSVL MQ CGG C IRL IRLPAHSTT* 

Ear I 
Ksp632I

GC3AAACATCGCArCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCA6GGGCTCG CGCCAGCCGAAC 

cgc tttgtagcgtagctcgctcg tgcatgagcctacc ttcggccagaacagctag tcctactagacctgcttc tcgtagtccccgagcgcggtcggcttg 3,C ' : 

* N A S S E H V L G V K P V L S I R M IV T K S IRG5RQPN 

Sph I Nco I

TGT TCGCCAGGCTCAAGGCGAGCATGCCCGACGGCGAGGATCTCGTCG TGACCCATGGCGArGCCTGCTTGCCGAATATCATGGTGG AAAATGGCCGC TT 
ACAAGCGGTCCGAGTTCCGCTCGTACGGGCTGCCGCrCCTAGAGCAGCACTGGGTACCGCTACGGACGAACGGCTTATAGTACCACCTTTTACCGGCGAA ^ 
C S P G S R R A C P T A R [ S S P M A M P A C R [ S V V K M A A 

Ear I

Nae I Bsr I Ksp632I

TTCrGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCT GAAGAGCTTGGCGGCGAATGU 
AA3ACCT AAGTAGC TGACACCGGCCGACCCACACCGCC TGGCGATAGTCC TGTATCGCAACCGATGGGCAC TATAACGAC TTC TCGAACCGCCGC T TACC ^ 
r L ° S S T V A G W V V p TA ! RT .RVLPV 1 LLK3LAAMG 
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GCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCrTGACGAGTTCrTCTGAGCGGGAC TCTGG3 


CGACTGGCGAAGGAGCACGAAATGCCATAGCGGCGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAAC TGCTCAAGAAGACTCGCCC TGAGACCC 
LTASSCFTVSP LPIRSASPS1AFLTSSSER0SG 

Asu II BspM I

GTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGG 


CAAGCTTTACTGGCTGGTTCGCTGCGGGTTGGACGGTAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAGGCC 

VRWORPSOAQPA ITRFRFHRRL L . KVGLRNRFP 



Nae I 



<spl Avr II 



GACGCCGGCTGGATGATCC TCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCTAGGGGGAGGCTAACTGAAACACGGAAGGAGACAATACCGG 


CTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGATCCCCCTCCGATTGACTTTGTGCCTTCCTCTG TTATGGCC 

GRRLDOPPARGSHAGVLRPP G E A N . NTEGONTG 


|<6pl 

AAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAACGCACGGTGTTGGGTCGTTTGTTCATAAACGCGGGGTTCGGTCCCAGGGCTGGCACTC 

TTCCTTGGGCGCGATACTGCCGTTATTTTTCTGTCTTATTTTGCGTGCCACAACCCAGCAAACAAGTATTTGCGCCCCAAGCCAGGGTCCCGACCGTGAG 
R N P R Y 0 C N K K T E . N A RCVVVCS.TRGSVPGIAL 

TGTC GATACCCCACCGAGACCCCATTGGGGCCAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCTCGCA 


ACAGCTATGGGGTGGCTCTGGGGTAACCCCGGTTATGCGGGCGCAAAGAAGGAAAAGGGGTGGGGTGGGGGGTTCAAGCCCACTTCCGGGTCCCGAGCGT 
C R Y P T £ T P L G P 1 R P R F FLFPTPPPKFG . RPRAR 

AlwN I OxaN I Dra I Dra I

GCCAACGTCGGGGCGGCAGGCC CTGCCATAGCCTCAGGTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGA 


CGGTTGC AGCCCCGCCGTCCGGG ACGGTATCGGAG TCCAATGAGTATATATGAAATCTAACTAAATTTTGAAGTAAAAATTAAATTTTCCTAGATCCACT 

SQ RRG GR PCH SLRLL fY TLO F K T S F L I .KOLGE 

BspH I

AGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGA 



TCTAGGAAAAAC tattagagtac tggttttagggaattgcac tcaaaagcaaggtgactcgcagtctggggcatcttttctagtttcctagaagaactct 

0 P F • SHDONPLT . VFVPLSVRPRRKDQR I FLR 



TCCTTTT TTT CTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGC TACCAGCGGTGG TTTGTTTGCCGGATCAAGAGC TACCAACfCTTTTTCCS 


AGGAAAAAAAGACGCGCATTAGACGACGAACGTTTGTTTTTTTGGTGGCGATGGTCGCCACCAAACAAACGGCCTAGTTCTCGATGGTTGAGAAAAAGGC 

S F F S A R N L L L A N K K T T A T S G G L F A GSRATN5FS 

Bsr I

AAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTG TCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACAT 


TTCCArTGACCGAAGTCGTCTCGCGTCTATGGTTTATGACAGGAAGATCACATCGGCATCAATCCGGTGGTGAAGTTCTTGAGACATCGTGGCGGATGTA 
E G NW LOQSAOTKYCPSSVAVVRPPLQELCSTAY I 
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